Service framework for a distributed object network system

ABSTRACT

An improved method and apparatus for providing a service framework for a distributed object network system are provided. In some embodiments, an apparatus that includes a server, a service for a limited resource residing on the server, and a pool of workers for the service that execute service requests from a client in a distributed object network system is provided. In some embodiments, a method that includes providing client-side service request encapsulation, balancing workloads among clones of service locators, clones of services, and workers in a worker pool of a service, and improving fault tolerance in a distributed object network system is provided.

FIELD OF THE INVENTION

[0001] This invention relates to programmed computers and, more particularly, to an improved method and apparatus for providing a service framework for a distributed object network system.

BACKGROUND

[0002] Computer systems that provide users access to limited resources are well known. For example, a client-server system represents a common paradigm for providing shared access to a limited resource such as a computer database on a server. The typical client-server system includes a computer (the “server”) in which one or more limited resources reside (e.g., are stored) and one or more satellite computers (the “clients”) which access the limited resources. The access is generally performed over an electronic communication system. The clients access the limited resources on an as-needed basis.

[0003] A server typically includes a computer or multiple computers connected via an electronic communication system, services (e.g., a data service that provides access to a database residing in a computer or a distributed service residing in multiple computers connected via an electronic communication system), and a storage for storing the services. The storage typically includes some combination of random access memory (“RAM”) and magnetic media, such as tapes and disks or optical media, and other storage devices. Depending on the requirements of the system, the server may be a personal desktop computer that includes a hard-disk, a large mainframe computer that includes multiple tape drives, or some other kind of computer.

[0004] A client may be a personal computer, a workstation, or some other kind of computer. A client may be either remote from the server (i.e., the client accesses the server via an electronic communication system) or local to the server (e.g., the client accesses the server via a local bus). A client may include one or more “applications” such as a word processor, a web browser, or database interface software to access information from a database on a server. Some of the applications may be under the control of a human operator and others may run automatically or under the control of another application.

[0005] An electronic communication system (“network”) may include commercial telephone lines as well as dedicated communication lines to carry data signals between the server and the client.

[0006] Prior client-server approaches allow a limited number of clients to access limited resources residing in a server. In particular, in the typical client-server environment, the workload characteristics are predictable and well known because of the predetermined limit on the number of clients and the well known nature of the clients.

[0007] However, increasing Internet usage presents a unique problem for providing a dramatically increasing and unpredictable number of clients efficient and fair allocation of access to limited resources residing in a server. Accordingly, prior client-server approaches are inadequate for the Internet environment, because the number of concurrent users in the Internet environment generally exceeds the number of concurrent users in a typical client-server environment and is generally more unpredictable.

[0008] A Common Object Request Broker Architecture (CORBA) represents a partial attempt to address the problem of providing an increasing and unpredictable number of users access to limited resources residing in a server. CORBA provides a client/server middleware defined by an industry consortium called the Object Management Group (OMG), which includes over 700 companies representing the entire spectrum of the computer industry.

[0009] In particular, CORBA defines an implementation-independent component-based middleware. CORBA allows intelligent components to discover each other and interoperate on an object bus called an Object Request Broker (ORB). CORBA objects can reside anywhere on a network. Remote clients can access a CORBA object via method invocations. Clients do not need to know where a CORBA object resides or on which operating system the CORBA object is executed. Thus, a client can access a CORBA object that resides in the same process or on a machine in another country connected via the Internet.

[0010] Further, both the language and compiler used to create CORBA objects are transparent to clients. For example, a CORBA object may be implemented as a set of C++ classes, in JAVA bytecode, or in COBOL code. Thus, the implementation of the CORBA object is irrelevant to the client. The client only needs to know the interface of the CORBA object. CORBA uses an Interface Definition Language (IDL) to define a CORBA object's interface, and the IDL also allows for specifying a component's attributes such as the parent classes it inherits from and the methods its interface supports. For example, a CORBA object provides an implementation for the CORBA object's IDL interface using a language for which an IDL mapping exists. In particular, CORBA defines a standard mapping from the IDL to other implementation languages such as C++, JAVA, ADA, etc. A CORBA IDL compiler generates client-side stubs and server-side skeletons for the CORBA object's IDL interface.

[0011] CORBA also specifies bus-related services for creating and deleting objects, accessing them by name, storing them in persistent stores, externalizing their states, and defining ad hoc relationships between them. Accordingly, CORBA provides a flexible distributed-object middleware that provides client-server interoperability. CORBA and JAVA are both further described in “Client/Server Programming with JAVA™ and CORBA” by Robert Orfali and Dan Harkey (John Wiley & Sons: New York, N.Y., 1997).

[0012] FIG. 1 shows a typical CORBA environment 38. A client 40 connects to a server 54 via a network 44 (e.g., via the Internet). A client 42 connects to the server 54 via a network 46 (e.g., via the Internet). A client 50 connects to the server 54 via a network 48 (e.g., via the Internet). CORBA provides local/remote transparency in a distributed object network as shown in FIG. 1 by providing Internet Inter-ORB Protocol (IIOP) services 52. An ORB service represents a standard CORBA service that can broker inter-object calls within a single process, multiple processes running within the same machine, or multiple processes running within different machines that may be across multiple networks and operating systems. For example, the client 42 includes an application that uses client-side stubs to obtain an object reference (e.g., a handle) to a remote CORBA object and to dispatch method invocations to the remote CORBA object. The communication between the client and the server-side object uses the IIOP.

[0013] Referring to FIG. 1, the server 54 includes a relational database management system (RDBMS) 60 (e.g., residing in a storage of the server 54). The server 54 also includes a data service 56, which can be implemented as a collection of CORBA objects, that encapsulates the limited resource, the RDBMS 60. For example, the data service 56 may provide a set of operations that can execute SQL queries and stored procedures and perform connection management.

[0014] Referring to FIG. 1, the data service 56 can be used by standard applications (e.g., a database interface) that reside in the clients 40, 42, and 50. The clients 40, 42, and 50 obtain a handle (e.g., an object reference) to bind to the data service 56. In particular, CORBA's object location mechanism includes the CORBA client stubs which offer a bind mechanism to locate a remote object and obtain an object reference for the remote object. Accordingly, the server 54 provides various services such as the data service 56. The server 54 also includes standard CORBA support services in a CORBA layer 58 for activating the data service 56 and administering the data service 56.

[0015] However, the standard CORBA support services do not provide significant client-side encapsulation for requesting a service, efficient workload balancing, a substantial variety of access modes, or robust fault tolerance. Accordingly, an improved method and apparatus for providing a service framework for a distributed object network system is needed.

BRIEF DESCRIPTION

[0016] The present invention provides an improved method and apparatus for providing a service framework for a distributed object network system. Accordingly, in some embodiments, the service framework includes a service proxy that encapsulates the operation of requesting a service from a server.

[0017] In some embodiments, the service framework also includes a load balancing manager for balancing workloads among workers in a worker pool of a service. Also, the service framework may include a service locator for balancing workloads among clones of a service. Further, the service framework may include a service locator proxy for balancing workloads among clones of service locators that provide handles (e.g., object references) to a service.

[0018] In some embodiments, the present invention is used to deploy scalable applications (e.g., enterprise applications) over the World Wide Web (WWW).

[0019] In some embodiments, a method is disclosed for providing client-side service request encapsulation. The method may also include balancing workloads among clones of service locators, clones of services, and workers in a worker pool. The method may also include improving fault tolerance in a distributed object network system.

[0020] Other aspects and advantages of the present invention will become apparent from the following detailed description and accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] FIG. 1 shows a typical CORBA environment.

[0022] FIG. 2 shows a framework model that includes a service framework in accordance with some embodiments of the present invention.

[0023] FIG. 3 shows a service framework for a distributed object network system in accordance with some embodiments of the present invention.

[0024] FIG. 4 is a flow diagram illustrating the operation of a service proxy in accordance with some embodiments of the present invention.

[0025] FIG. 5 shows the service proxy of the client of FIG. 3 in greater detail in accordance with some embodiments of the present invention.

[0026] FIG. 6 is a flow diagram illustrating the operation of the client of FIG. 3 during a service request in accordance with some embodiments of the present invention.

[0027] FIG. 7 shows a service object of the server of FIG. 3 in accordance with another embodiment of the present invention.

[0028] FIG. 8 provides a call to allocateWorker on a service object in accordance with some embodiments of the present invention.

[0029] FIG. 9 provides reservation properties in accordance with some embodiments of the present invention.

[0030] FIG. 10 provides a reservation context in accordance with some embodiments of the present invention.

[0031] FIG. 11 provides service properties in accordance with some embodiments of the present invention.

[0032] FIG. 12 provides a reservation interface of the service object of FIG. 3 in accordance with some embodiments of the present invention.

[0033] FIG. 13 is a flow diagram illustrating the reservation revocation operation in accordance with some embodiments of the present invention.

[0034] FIG. 14 is a flow diagram illustrating the operation of reserving a previously reserved worker in accordance with some embodiments of the present invention.

[0035] FIG. 15 provides a reservation revocation call back interface in accordance with some embodiments of the present invention.

[0036] FIG. 16 provides worker properties in accordance with some embodiments of the present invention.

[0037] FIG. 17 is a flow diagram illustrating the operation of pinging workers in accordance with some embodiments of the present invention.

[0038] FIG. 18 shows a server in accordance with another embodiment of the present invention.

[0039] FIG. 19 shows an out-of-process worker factory and an in-process worker factory in accordance with some embodiments of the present invention.

[0040] FIG. 20 shows clone factories in accordance with some embodiments of the present invention.

[0041] FIG. 21 provides an object factory interface in accordance with some embodiments of the present invention.

[0042] FIG. 22 provides service locator properties in accordance with some embodiments of the present invention.

[0043] FIG. 23 provides a service locator interface in accordance with some embodiments of the present invention.

[0044] FIG. 24 provides a load balancing manager (LBM) interface in accordance with some embodiments of the present invention.

[0045] FIG. 25 shows a fully capable LBM in accordance with some embodiments of the present invention.

[0046] FIGS. 26A-26B are a flow diagram illustrating the call sequence operation in accordance with some embodiments of the present invention.

[0047] FIG. 27 shows a client wait queue and an idle queue of the LBM of FIG. 25 in accordance with another embodiment of the present invention.

[0048] FIG. 28 shows a service object and an LBM in accordance with another embodiment of the present invention.

[0049] FIG. 29 shows the scalability of the service framework in accordance with some embodiments of the present invention.

[0050] FIG. 30 is a flow diagram illustrating the fault tolerance operation for when a service object becomes unavailable in accordance with some embodiments of the present invention.

[0051] FIG. 31 shows an administrative interface in a server in accordance with some embodiments of the present invention.

[0052] FIG. 32 provides an interface of the administrative interface in accordance with some embodiments of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0053] The present invention provides an improved method and apparatus for providing a service framework for a distributed object network system. In particular, a client-server system in which a significant number of clients access distributed objects on a server (e.g., a distributed enterprise application deployed over the global Internet) would significantly benefit from this improved method and apparatus.

[0054] For example, an application (e.g., a web browser) that runs on a client may allow the client to access a data service (e.g., a service for a database such as an RDBMS) that runs on a server. The data service often must be made accessible to a significant and unpredictable number of clients. Moreover, a significant number of clients may attempt to simultaneously access the data service, but some users may have higher priority than other users. Accordingly, a service framework should provide significant client-side encapsulation for requesting a service, ensure efficient workload balancing, offer a variety of access modes, provide significant scalability, and maintain robust fault tolerance.

[0055] FIG. 2 shows a framework model 70 that includes a service framework 76 in accordance with some embodiments of the present invention. In particular, the service framework 76 is implemented on top of a Common Object Request Broker Architecture (CORBA) bus 78. The service framework 76 provides a platform of services which extend the CORBA support services in the CORBA bus 78. In some embodiments, the service framework 76 includes services that support and encapsulate enterprise resources (i.e., a service represents an encapsulation of a resource) such as a data service 72 or other services 74 (e.g., an Email service, a document management service, etc.). For example, the data service 72 includes a service CORBA object and a pool of worker CORBA objects. In some embodiments, an object represents a set of computer instructions that can be executed by a computer.

[0056] In some embodiments, the service framework 76 provides improved access to services, manages the life cycle of services, and provides administrative capabilities to manage a group of services. In some embodiments, the service framework 76 is implemented as a collection of objects written in JAVA and C++, and the architecture of the service framework 76 is the definition of the collection of objects and the interaction among objects in the collection. In some embodiments, the service framework 76 supports services implemented using C++, JAVA, etc. Accordingly, the service framework 76 provides a variety of methods to access services, and the service framework 76 also provides an architecture that is scalable, modular, fault tolerant, and easily extendible (i.e., new services are easy to plug in).

[0057] Generally, in order to access a service, a client must obtain a handle (e.g., an object reference) to a particular service object for the service on which the client intends to execute an operation (a service request). In some embodiments, the service framework includes an object (e.g., a CORBA object) called a service locator that maintains the name space of service instances. For example, the service locator repository may contain an entry for an RDBMS service instance with the name “RDBMS service”, a handle to the RDBMS service's service object, and a set of properties for the service (e.g., database types supported). The service locator exports a service lookup interface that provides a findService operation. The findService operation takes a service name and/or a set of properties for a service, and returns a set of service object handles that match the name and/or the set of properties.
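The lookup described in the preceding paragraph can be illustrated with a minimal Java sketch. It is not the framework's actual interface: the class name ServiceLocatorSketch, the use of plain strings in place of CORBA object references, and the rule that a null name or null property set matches everything are all assumptions made so that the example runs without an ORB.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the service locator's repository.
// A String handle replaces the CORBA object reference (assumption).
class ServiceLocatorSketch {

    // One repository entry: service name, handle, and service properties.
    private static final class Entry {
        final String name;
        final String handle;
        final Map<String, String> properties;

        Entry(String name, String handle, Map<String, String> properties) {
            this.name = name;
            this.handle = handle;
            this.properties = new HashMap<>(properties);
        }
    }

    private final List<Entry> repository = new ArrayList<>();

    // Called by a service manager to register one of its services.
    synchronized void registerService(String name, String handle,
                                      Map<String, String> properties) {
        repository.add(new Entry(name, handle, properties));
    }

    // Returns the handles of all services matching the name and/or the
    // required properties. A null argument places no constraint (assumption).
    synchronized List<String> findService(String name, Map<String, String> required) {
        List<String> handles = new ArrayList<>();
        for (Entry e : repository) {
            boolean nameOk = name == null || name.equals(e.name);
            boolean propsOk = required == null
                    || e.properties.entrySet().containsAll(required.entrySet());
            if (nameOk && propsOk) {
                handles.add(e.handle);
            }
        }
        return handles;
    }

    public static void main(String[] args) {
        ServiceLocatorSketch locator = new ServiceLocatorSketch();
        locator.registerService("RDBMS service", "handle-1",
                Collections.singletonMap("dbType", "oracle"));
        System.out.println(locator.findService("RDBMS service", null));
    }
}
```

Returning a list rather than a single handle mirrors the statement that findService returns a set of matching service object handles, which leaves the choice among clones to the caller.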

[0058] FIG. 3 shows a service framework for a distributed object network system in accordance with some embodiments of the present invention. A client 80 includes a service proxy 82 residing in client storage. A service locator 84 includes a registry of services. Thus, the service locator 84 provides the location (e.g., an object reference) of a particular service in the distributed object network system in which there may be a large number of services at any one time. Each service residing in a server 88 is managed by a service manager (SM) 86. The SM 86 performs several tasks. For example, the SM 86 starts a service and starts a service locator. The SM 86 then registers the services under its control with all known service locators in the distributed object network system. In another embodiment, there may be more than one service locator as discussed further below with respect to FIG. 20, but each service locator contains the same set of services accessible over the entire distributed object network system. Also, services can be grouped logically, as discussed further below with respect to FIG. 18.

[0059] Referring to FIG. 3, the service locator 84 exports a registerService method that is used by the SM 86 to register its services with the service locator 84. The registerService call simply registers a collection of services managed by an SM such as the SM 86 and the properties of each service. However, not all properties of a service may be registered by the SM 86. In some embodiments, only those properties that would have an impact on the clients are registered by the SM 86. The SM 86 also periodically updates the repository of all the services registered with the service locator 84 along with any changes in the set of services that it manages.

[0060] In some embodiments, if the SM 86 does not ping the service locator 84 in a predetermined period of time, the service locator 84 assumes that the SM 86 and all its contained services have died and quietly removes the set of services from its repository. Also, if the parent of the service locator 84 (the service manager that launched the service locator, for example, the SM 86) does not ping in time, the service locator 84 assumes that the parent exited due to an error and terminates itself. These operations are part of the fault tolerance capabilities of the service framework of the present invention.
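The ping-based removal described in the preceding paragraph resembles a lease table, which can be sketched in Java as follows. The class name PingLeaseTable, the explicit timestamp parameters (used instead of a system clock so that the behavior is deterministic), and the string identifiers are hypothetical and serve only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical lease table kept by a service locator: a service manager
// that has not pinged within the timeout loses all of its registrations.
class PingLeaseTable {
    private final long timeoutMillis;
    private final Map<String, Long> lastPing = new HashMap<>();
    private final Map<String, List<String>> services = new HashMap<>();

    PingLeaseTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Registers the services of one manager and starts its lease.
    synchronized void register(String managerId, List<String> names, long nowMillis) {
        services.put(managerId, new ArrayList<>(names));
        lastPing.put(managerId, nowMillis);
    }

    // Renews a lease. Returns false for an unknown (e.g., already expired)
    // manager, which would then have to register again.
    synchronized boolean ping(String managerId, long nowMillis) {
        if (!lastPing.containsKey(managerId)) {
            return false;
        }
        lastPing.put(managerId, nowMillis);
        return true;
    }

    // Drops every manager whose last ping is older than the timeout and
    // returns the names of the services that were removed with it.
    synchronized List<String> expire(long nowMillis) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastPing.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> lease = it.next();
            if (nowMillis - lease.getValue() > timeoutMillis) {
                removed.addAll(services.remove(lease.getKey()));
                it.remove();
            }
        }
        return removed;
    }

    // Names of all services whose manager still holds a lease.
    synchronized List<String> liveServices() {
        List<String> live = new ArrayList<>();
        for (List<String> names : services.values()) {
            live.addAll(names);
        }
        return live;
    }

    public static void main(String[] args) {
        PingLeaseTable table = new PingLeaseTable(1000);
        table.register("SM 86", Arrays.asList("RDBMS service"), 0);
        System.out.println(table.expire(5000));
    }
}
```

The same pattern could be applied in the other direction, with the service locator terminating itself when its parent's lease expires.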

[0061] In some embodiments, the SM 86 essentially represents an administrative object and does not perform a direct role in executing requests for services. For example, the SM 86 manages the life cycle and other administrative issues of the services. The SM 86 may also collect the statistics of the services and pass the statistics to the service locator 84 for workload balancing purposes. These operations may be performed automatically by the SM 86. Further, the SM 86 may be responsible for instantiating the service locator 84 and all of the services residing in the server 88. The SM 86 exports an administrative interface (as discussed further below with respect to FIG. 32) that allows the definition of new services, bringing services up and down, and modifying the properties of the services (e.g., the number of workers in the worker pool of a service). In addition, for each service, the SM 86 may instantiate a service object such as the service object 90.

[0062] In some embodiments, the service object 90 is responsible for instantiating a worker pool which may include workers 92, 94, and 96 (e.g., the workers 92, 94, and 96 may include different worker properties as discussed further below with respect to FIG. 16). The service object 90 may also be responsible for reserving a worker as discussed further below. The workers 92, 94, and 96 support the operations of the service (e.g., a DataService includes operations that provide access to a database). The workers 92, 94, and 96 may be distributed among different address spaces, which provides fault tolerance in the event of an abnormal failure of a particular address space. As shown in FIG. 3, the service object 90 manages the workers 92, 94, and 96 and, in some embodiments, provides a reservation mechanism for clients. The reservation mechanism is discussed in further detail below.

[0063] In some embodiments, the service proxy 82 supports the methods exported by the IDL interfaces of the workers 92, 94, and 96. For example, a worker method that is encapsulated in the service proxy 82 uses an object reference to the worker 92 to dispatch the worker method to the remote worker 92 and obtain results from the worker method. The communication between the service proxy 82 and the worker 92 is performed using the IIOP.

[0064] The service framework as shown in FIG. 3 represents a collection of objects (e.g., the service proxy 82, the service locator 84, the SM 86, the service object 90, the workers 92, 94, and 96, etc.). The architecture of the service framework is the definition of the objects and the interaction mechanism between the objects.

[0065] In some embodiments, the service locator 84, the SM 86, the service object 90, and the workers 92, 94, and 96, all represent CORBA objects. Each of these objects exports a CORBA interface in the form of an IDL (Interface Definition Language) interface. The client 80 includes the service proxy 82 that can bind to the service locator 84 by name (e.g., using CORBA's object location mechanism). Other objects such as the service object 90 and the workers 92, 94, and 96 represent transient objects. The service proxy 82 can get a handle (e.g., an object reference) to transient objects such as the service object 90 through the getService operation of the service locator 84. The communication between the service proxy 82 and the service object 90 may be a direct CORBA invocation over IIOP.

[0066] In some embodiments, most of the service framework is implemented in JAVA. In particular, the SM 86, the service object 90, and the service locator 84 are all implemented in JAVA, but the service proxy 82 and the workers 92, 94, and 96 are implemented in both JAVA and C++. For example, C++ clients can use a C++ service proxy to bind to the service locator 84, obtain a CORBA object reference to a service object, and reserve a worker by obtaining an object reference to a worker. Also, services can be built using C++. For example, an RDBMS service can be implemented in C++ and use native client libraries to access the database. The service proxy 82 as shown in FIG. 3 encapsulates various operations such as executeRequest, allocateWorker, and getService, and these operations are further discussed below with respect to FIG. 4.

[0067] In some embodiments, the server 88 may include, for example, a Solaris™ operating system, an HP-UX™ operating system, or a Windows NT™ operating system. In some embodiments, the server 88 includes a data service which provides a relational database system, a standard type of commercial data management technology.

[0068] In some embodiments, the service object 90 is a service object for a data service. A service is a logical term that represents an encapsulation of a resource. For example, if the resource is a relational database management system (RDBMS), then an RDBMS service encapsulates the RDBMS by providing, for example, a set of operations that can execute SQL queries and stored procedures and perform connection management. Client applications that require access to the RDBMS can use the RDBMS service. In some embodiments, the RDBMS service is implemented as a collection of objects that collectively perform the functions of the RDBMS service.

[0069] In particular, the service object 90 provides the workers 92, 94, and 96 that encapsulate a resource on the server 88. Using an RDBMS service as an example, the workers 92, 94, and 96 each export an interface of operations that includes executing an SQL statement or a stored procedure. The workers 92, 94, and 96 also manage the connections to the RDBMS resource. In some embodiments, the workers 92, 94, and 96 encapsulate any connection state, cache execution results, and perform cursor-based lookups (e.g., maintain state on behalf of the connection). Further, in some embodiments, the workers 92, 94, and 96 are implemented as a class in C++ or JAVA that derives directly from a class of the service framework that provides all the capabilities of the service framework (e.g., workload balancing, fault tolerance, etc.). Thus, the workers 92, 94, and 96 may use the object-oriented inheritance mechanism to inherit all the capabilities of the service framework, and the workers 92, 94, and 96 may provide an interface (e.g., an IDL interface) for export. Each worker is independent of the other workers (i.e., each worker is unaware of the other workers). Hence, the implementation of the worker needs to consider the resource logic only, not the service framework itself.

[0070] The service object 90 (in some embodiments, there is only one service object per service) is responsible for handing out a handle of a worker in a worker pool (i.e., a worker handle) to clients that are, using the RDBMS example, interested in executing an SQL statement or stored procedure (i.e., interested in performing an RDBMS transaction). In some embodiments, the service object 90, in deciding which worker from the pool to allocate, performs workload balancing among the workers and may also offer a variety of access modes from clients (e.g., transactional access, exclusive access, shared access, priority based access, etc.) as discussed further below. The service object 90 exports a worker reservation interface that includes allocateWorker and releaseWorker operations (e.g., methods) to allocate a worker and release a previously allocated worker, respectively, as discussed further below with respect to FIG. 12.
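The allocateWorker and releaseWorker operations described above can be sketched in Java under simplifying assumptions. The sketch covers only exclusive access with a first-idle allocation policy; the class name ServiceObjectSketch, the string worker handles, and the use of Optional to signal an exhausted pool are hypothetical and are not taken from the reservation interface of FIG. 12.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical service object that hands out worker handles from a pool.
// Only exclusive access is modeled; other access modes are omitted.
class ServiceObjectSketch {
    private final Deque<String> idle = new ArrayDeque<>();
    private final Set<String> reserved = new HashSet<>();

    ServiceObjectSketch(List<String> workerHandles) {
        idle.addAll(workerHandles);
    }

    // Reserves the worker that has been idle longest, or returns an empty
    // Optional when every worker is currently reserved.
    synchronized Optional<String> allocateWorker() {
        String worker = idle.pollFirst();
        if (worker == null) {
            return Optional.empty();
        }
        reserved.add(worker);
        return Optional.of(worker);
    }

    // Returns a previously allocated worker to the idle queue. Returns false
    // if the handle was not reserved, so a double release is harmless.
    synchronized boolean releaseWorker(String worker) {
        if (!reserved.remove(worker)) {
            return false;
        }
        idle.addLast(worker);
        return true;
    }

    public static void main(String[] args) {
        ServiceObjectSketch service =
                new ServiceObjectSketch(Arrays.asList("worker 92", "worker 94", "worker 96"));
        Optional<String> worker = service.allocateWorker();
        System.out.println(worker.orElse("no worker available"));
        worker.ifPresent(service::releaseWorker);
    }
}
```

Appending released workers to the tail of the idle queue spreads requests across the pool in round-robin fashion, which is one simple form of workload balancing among workers.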

[0071] In the RDBMS example, the service locator 84 maintains the name space of service instances (i.e., the set of services registered in the service locator of the cell; in some embodiments, every service in the cell is registered with the service locator, and services in a different cell are not registered with the service locator). The service locator 84 thus serves as a repository of service objects, service names, and a set of properties associated with each service (service properties are discussed further below with respect to FIG. 11). Its local repository contains an entry for the RDBMS service with the name “RDBMS service”, a handle to the service object 90 for the RDBMS service, and a set of properties for the RDBMS service. The service locator 84 exports a service lookup interface that contains a findService operation. The findService operation takes a service name and/or a set of properties for a service and returns a set of service object handles that match the name and/or properties. Thus, the client 80, if it is interested in obtaining a handle to a particular service on which it intends to execute an operation, executes a findService operation on the service locator 84 with the name of the service it is interested in (in this example, “RDBMS service”). In some embodiments, the service locator 84 also provides a level of workload balancing in the service framework, which is discussed further below. In addition, the service locator 84 periodically updates and loads existing services.

[0072] Accordingly, FIG. 3 illustrates the service framework that provides significant extensions to a CORBA-based distributed object network system in accordance with some embodiments of the present invention. Those skilled in the art will recognize that the service framework can also be provided in a system with multiple servers with multiple (distributed) services and (distributed) clones of services.

[0073] In some embodiments, a WWW (world wide web) or web application is provided using the service framework of the present invention. In particular, the client 80 includes a client application such as a web browser (e.g., Netscape Navigator) that is capable of rendering HTML (hyper-text markup language) or hosting JAVA applets or ActiveX controls. A web server (not shown) that supports HTTP requests from the web browser and launches a thin CGI/ISAPI/NSAPI to deliver HTTP requests to the application server (e.g., the server 88) is provided.

[0074] For example, a web browser in the client 80 dispatches an HTTP request to the web server which launches a CGI, NSAPI, or ISAPI plug-in that is a client to the application server (e.g., the server 88). In particular, the plug-in represents a CORBA client that issues a call to the service locator 84 to locate an appropriate service (e.g., a web service). Thus, the application server may be completely isolated from the web server. The plug-in then dispatches the web event to the web service. The application server may provide a service implemented using CORBA objects that are activated (i.e., instantiated) prior to the incoming call for the service provided by the application server. The web service executes the application logic, interacts with one or more data services in the application server, and stores session and state information. The web service returns the result of the web event, which is an HTML page, back to the web browser in the client 80. Alternatively, the web browser can host a JAVA applet or an ActiveX control that connects directly to the application server using the IIOP. Thus, the JAVA applet bypasses the web server and interacts directly with the application server using the IIOP, which may provide improved performance.

[0075] Accordingly, the service framework of the present invention provides a distributed, fault tolerant, scalable, and object-oriented architecture that supports a variety of services that can be accessed over the web (e.g., enterprise web applications). Further, the service framework of the present invention provides a reservation mechanism (as discussed below) that supports web applications that can efficiently and fairly manage a collection of resources that are accessed by clients across the Internet. Also, the dynamic scalability and fault tolerance of the present invention are particularly advantageous for enterprise applications that can afford little or no downtime.

[0076] FIG. 4 is a flow diagram illustrating the operation of a service proxy in accordance with some embodiments of the present invention. In particular, FIG. 4 shows the stages of operation that must be performed by a client that wishes to execute, in the RDBMS example, an SQL request. In such an embodiment, rather than having the client perform each of the steps necessary to get a worker object, a service proxy object is provided, and the service proxy object performs the necessary steps to encapsulate the process of obtaining a worker for a particular service. Thus, from the client's perspective, the client simply requests execution of an operation on a service, and the process of obtaining a worker is transparent to the client, because the service proxy performs the necessary steps to obtain the worker. This approach is advantageous, because this approach encapsulates the framework internals so that clients are not required to know anything about the framework internals, and clients do not have to perform the task of handling any errors encountered when executing the steps for obtaining a worker. Moreover, this approach advantageously allows the framework to support fault tolerant features such as automatically retrying the request if the original request fails (e.g., error handling).

[0077] In some embodiments, the service proxy is an object that resides in the client and encapsulates a particular service in its entirety. In some embodiments, the service proxy intercepts every call from the client to the worker, and the service proxy includes the same set of methods as the worker. For example, a client applet (e.g., a JAVA applet such as CNdDataServiceProxy dsproxy = new netdyn.services.ds.client.CNdDataServiceProxy()) invokes an operation on the worker by executing the equivalent method on the service proxy (e.g., an execute SQL call). The service proxy then reserves a worker (e.g., if a worker is not already reserved, then allocateWorker is called on the service to obtain a worker object reference) and dispatches the method (e.g., the execute SQL call) to the worker using a CORBA/IIOP call. The service proxy obtains the result of the method from the worker and then passes the result back to the client. Thus, the service proxy is aware of every method invocation on a worker.

[0078] In some embodiments, each service in the service framework has a corresponding service proxy. Thus, a service proxy encapsulates a service, because from the client's perspective the service proxy implements all the operations of the service itself. For example, the service proxy is responsible for ensuring that a worker handle has been obtained before invoking the operation on the worker. As a result, the task of locating the appropriate service and obtaining a worker is encapsulated within the service proxy. For example, there may be a single instance of the data service for an RDBMS, but there may be many clients requesting access to the data service. Each client instantiates a service proxy and invokes the “execute SQL” operation on the service proxy. All the instantiated service proxies (in each client) simply invoke the corresponding “execute SQL” operation on the service itself. Hence, from the client's perspective, the client can instantiate a service and execute an operation on the service. However, the service proxy actually obtains a handle to a worker and forwards all requests to the allocated worker.

[0079] In particular, FIG. 4 illustrates the stages of operation performed by the service proxy in accordance with some embodiments of the present invention. Reference numeral 100 refers to a first stage in this embodiment. In stage 100, the service proxy obtains a handle to the service locator. For example, the service locator object is registered with the ORB by providing a unique object name (e.g., using the obj_is_ready call). After the service locator object has been registered with the ORB, the client can use the ORB-provided bind call and supply the name of the service locator object to obtain an object reference to the service locator.

[0080] In some embodiments, the service framework provides a first level of workload balancing. In the first level of workload balancing, as discussed further below, the findService operation returns the handle of a particular service locator instance selected among multiple instances of the service locator (e.g., service locator clones). In stage 102, the service proxy calls findService on the service locator with a service name “RDBMS service”, in the RDBMS example.

[0081] Referring to FIG. 4, in stage 104, the service locator returns the handle of an available RDBMS service (e.g., an RDBMS service that is currently up and running). In some embodiments, the service framework provides a second level of workload balancing. In the second level of workload balancing, as discussed further below, the service locator periodically requests and loads statistics from all the services (e.g., across machines in a distributed configuration) and then can provide the handle of the least busy instance of the requested service based on the statistics. Also, at this point, security checks may be performed to ensure that the client ID, as discussed further below, is valid for the requested service access (e.g., license restrictions).

[0082] In stage 106, the service proxy calls allocateWorker on the service object with a set of access requirements and worker hints, as discussed further below with respect to FIG. 9. In some embodiments, a third level of workload balancing is provided. In particular, as discussed further below, when a call to allocateWorker on the service object is presented, reservations are requested based on some level of access or some class of access specified by the service proxy. The requested worker access must be valid relative to the client ID of the client of the requesting service proxy. For example, a specific duration of reservation may be requested, exclusive access may be requested, or access that chooses to wait or not to wait may be requested as discussed further below with respect to FIG. 9. Thus, in the third level of workload balancing, a client that is itself a high priority client is permitted access for high priority work and thus can request a high priority worker. The reservation mechanism is implemented in the service object and is discussed further below.

[0083] In stage 108, the allocateWorker operation returns an appropriate worker as determined by the service object's load balancing manager (LBM) based on runtime workload statistics of each worker as discussed further below with respect to FIG. 25. In stage 110, the service proxy uses the worker handle to execute the SQL request on the worker, and the worker returns the output from the execution to the service proxy of the client. In stage 112, the service proxy calls releaseWorker on the service object to release the reservation on the worker.
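
By way of illustration only, stages 100 through 112 can be sketched in JAVA as follows. The interfaces Worker, Service, and ServiceLocator and the class DataServiceProxy are hypothetical stand-ins for the CORBA object references; only the operation names findService, allocateWorker, and releaseWorker are taken from the description, and their parameter lists here are simplified assumptions.

```java
// Hypothetical stand-ins for the remote objects; signatures are assumptions.
interface Worker {
    String executeSql(String sql);
}

interface Service {
    Worker allocateWorker(String clientId);   // stages 106 and 108
    void releaseWorker(Worker worker);        // stage 112
}

interface ServiceLocator {
    Service findService(String serviceName);  // stages 102 and 104
}

// The proxy exposes the same operation as the worker and hides the lookup.
class DataServiceProxy {
    private final ServiceLocator locator;     // stage 100: handle obtained up front
    private final String clientId;

    DataServiceProxy(ServiceLocator locator, String clientId) {
        this.locator = locator;
        this.clientId = clientId;
    }

    String executeSql(String sql) {
        Service service = locator.findService("RDBMS service");
        Worker worker = service.allocateWorker(clientId);
        try {
            return worker.executeSql(sql);    // stage 110
        } finally {
            service.releaseWorker(worker);    // stage 112, even if execution fails
        }
    }
}
```

In this sketch the client only ever calls executeSql on the proxy; the release in the finally clause is a design choice of the sketch, not a requirement stated in the description.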

[0084] Accordingly, FIG. 4 illustrates the advantages of providing a service proxy that encapsulates the process of requesting a service. For example, the service proxy implements fault tolerance. In particular, if a client requests a service (e.g., in a wait mode as described further below with respect to FIG. 9) and there are no available workers for the service, the service proxy can request access to a worker for the service and the request is queued (e.g., in a first in, first out (FIFO) queue). Further, the service proxy can implement a fairly sophisticated system of retrying. For example, if the service proxy requested a worker for a service and the worker was not available, the service proxy can retry by re-requesting the service from the service locator. Moreover, the fault tolerance mechanisms performed by the service proxy are completely transparent to the client.

[0085] FIG. 5 shows the service proxy 82 of the client 80 of FIG. 3 in greater detail in accordance with some embodiments of the present invention. The service proxy 82 encapsulates the complex logic involved in requesting a service.

[0086] In some embodiments, each service has a service proxy that has the same set of operations as the service itself (i.e., the service proxy interface is identical to the interface of the service's workers). The client uses the service proxy by simply instantiating the service proxy and executing an operation on the service proxy. Thus, from the client's perspective, the service proxy is simply the service (i.e., the back-end resource).

[0087] As shown in FIG. 5, the service proxy 82 has the following responsibilities: bind to the service locator, find an appropriate service using the service locator, obtain a worker using the service object, and execute the service request using the allocated worker. In particular, the service proxy 82 can bind to the service locator by using the service locator's instance name. After binding to the service locator, the service proxy 82 caches the reference of the service locator for subsequent lookup. Each service has a property called the instance name, which represents the name of the service which does not change for a particular service. Because the service proxy 82 has a one-to-one correspondence with a particular service, the service proxy 82 knows the instance name of the service it represents. The service locator includes a repository of handles to services that have been instantiated by the SM. The service proxy 82 uses the getService operation, passing the name of the service instance, to obtain a service handle. These operations will return the handle to a suitable service if available. Once it obtains a handle to a service, the service proxy 82 caches the reference for subsequent lookup.

[0088] In some embodiments, the service proxy 82 uses a default set of reservation properties when reserving a worker. A client is allowed to change the reservation properties before issuing any operations on the service proxy 82 and at any subsequent time. The worker ID (e.g., handle to the worker) of the reserved worker is cached along with the reservation context that contains the client ID, the service, and the reserved worker, as discussed further below with respect to FIG. 10. The reservation context is passed automatically by the service proxy 82 on each outgoing call. Thus, the service proxy 82 also hides the internal details such as reservation properties, reservation context, etc. The service proxy 82 may perform invalidation of cached references periodically in order to detect any changes in the configuration (e.g., new service locators, new service clones, new workers). Invalidating the cache periodically also forces the service proxy 82 to talk to the service locator periodically and therefore improves the effectiveness of the service locator and the service object in performing dynamic workload balancing. The client may also be allowed to, at any time, invalidate the cached worker service and the cached service locator handles.

[0089] FIG. 6 is a flow diagram illustrating the operation of the client 80 of FIG. 3 during a service request in accordance with some embodiments of the present invention. In particular, FIG. 6 illustrates the operation of performing a service request from the perspective of the client 80 of FIG. 3. Reference numeral 140 refers to a first stage in this embodiment. In stage 140, the client 80 requests a service, and the appropriate service proxy intercepts the service request (as discussed above). In stage 142, the client executes the execute SQL request on the instantiated service proxy (e.g., the service proxy includes the same methods included in the workers of the requested service, and the service proxy executes the request on the allocated worker). Finally, in stage 144, the service proxy forwards the output to the client (after the worker returns the output from the execution of the execute SQL request to the service proxy of the client). Accordingly, the service proxy 82 encapsulates the requested service operation, which significantly simplifies the operation from the perspective of the client.

[0090] FIG. 7 shows a service object 160 of the server 88 of FIG. 3 in accordance with another embodiment of the present invention. In particular, in response to an allocateWorker call by the service proxy on the service object 160, the service object 160 allocates a worker from its worker pool, workers 162 and 164, to the service proxy so that the service proxy can issue work requests. The service object 160 may use an LBM (load balancing manager), as discussed further below with respect to FIG. 25, to select a worker for the service proxy; the service object 160 then calls newClient on the selected worker 162 to notify the worker of the reservation. Also, if the service proxy wants to release the reserved worker 162, then the service proxy calls releaseWorker on the service object 160, and the service object 160 calls clientReleased to notify the worker 162. Each service object (in some embodiments, there is only one service object per service) controls its own pool of workers. Thus, the service object 160 controls the pool of workers including workers 162 and 164. In some embodiments, worker allocation is implemented to support a variety of modes of access to workers, provide fast response time, and balance the workload from client requests across all workers in the worker pool.

[0091] In particular, the workers 162 and 164 encapsulate a limited resource such as an RDBMS. The service object 160 provides access to the limited resource and ensures that all clients get their fair share of access to the limited resource. For example, workers should be relatively equally loaded at all times to ensure reasonably predictable throughput and linear scalability. Ignoring hardware and operating system scalability limitations, adding more workers may increase throughput. Accordingly, to handle these requirements, the service object 160 may include a worker reservation mechanism as discussed further below.

[0092] FIG. 8 provides a call to allocateWorker on a service object in accordance with some embodiments of the present invention. The call to allocateWorker on a service object includes parameters for specifying the service, the service properties, and the reservation context. In particular, the reservation context provides some history of the client. For example, any particular workers that have performed work for the client may be provided in the reservation context. Thus, the service can allocate to a client the same worker that had previously done work for the client. In some embodiments, a worker may cache work results and worker hints respecting a particular client ID so that if the same worker is reallocated to the client, then the worker has this information already cached with respect to the client. Accordingly, caching work results and worker hints would be particularly advantageous for a data service or any other service in which the client would benefit by returning to the same worker that cached previous work results. Also, the reservation context may include a client ID and security credentials for clients. Thus, for security reasons, the service can actually recognize a client using the client ID. The client ID is discussed further below with respect to FIG. 10.

[0093] FIG. 9 provides reservation properties 166 in accordance with some embodiments of the present invention. In particular, a client can reserve a worker in the mode that is most suitable to the task to be performed by the client. Thus, some or all of the reservation properties 166 provided in FIG. 9 may be implemented. In particular, the service object 160 of FIG. 7 implements a reservation interface, as discussed further below with respect to FIG. 12. During the period that the worker is reserved for a particular client, the client is allowed access to the worker. Once the reservation expires, the client is no longer allowed access to the worker. Accordingly, to reserve a worker, the client issues a request specifying the reservation properties that apply to the reservation and any worker hints that the client would like to pass on to the service object.

[0094] As shown in FIG. 9, the reservation properties 166 include a client priority, an access mode, a wait mode, and a reservation time. The reservation properties 166 control the type of reservation that the client would like to obtain before the worker attempts to perform the desired action. Reserving the worker in the appropriate mode is critical. For example, if the worker allows multiple clients for read access, but a single client for write access, the client performing a write operation must reserve the worker in exclusive mode. The reservation duration can be passed as a hint (e.g., a parameter provided in a call or method invocation) by the client for worker scheduling purposes. For example, the service object may prefer a short duration client over a long duration client. In some embodiments, there is no fixed time duration that classifies a request as short or long, and thus, this is simply up to the client's discretion.

[0095] Referring to FIG. 9, the client priority determines how quickly the client can obtain a worker. In some embodiments, the worker pool that is instantiated by the service object contains workers of high, medium, and low priority as discussed further below with respect to FIG. 27. The properties of the worker pool determine how many workers are instantiated and how many of each priority. In some embodiments, a high priority worker can only be used by a high priority client, a medium priority worker can be used by high and medium priority clients, and a low priority worker can be used by clients of any priority. Thus, a high priority client waits only for any existing high priority clients. Hence, it is up to the client to decide its priority level before issuing the request.
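
As a minimal sketch, the priority rule stated above reduces to a single comparison: a worker may serve a client whose priority is at least the worker's own priority. The enum Priority and the class PriorityRule are illustrative names that do not appear in the figures.

```java
// Priority levels named in the description, declared in ascending order
// so that the natural enum ordering matches LOW < MEDIUM < HIGH.
enum Priority { LOW, MEDIUM, HIGH }

final class PriorityRule {
    private PriorityRule() {}

    // HIGH workers serve only HIGH clients, MEDIUM workers serve MEDIUM
    // and HIGH clients, and LOW workers serve clients of any priority.
    static boolean canUse(Priority client, Priority worker) {
        return client.compareTo(worker) >= 0;
    }
}
```

For example, PriorityRule.canUse(Priority.MEDIUM, Priority.HIGH) evaluates to false, while PriorityRule.canUse(Priority.HIGH, Priority.LOW) evaluates to true.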

[0096] Referring to FIG. 9, the access mode determines how many concurrent reservations can be given out on a particular worker. The maxClients property is a property of the workers that determines the number of concurrent reservations in shared mode that may be allocated to the worker (i.e., workers with maxClients greater than one represent multi-threaded workers). In exclusive mode, only one reservation is permitted. Accordingly, the access mode facilitates the maximum number of concurrent accesses to a limited resource.
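
The admission check implied by the access mode can be sketched as follows. The class WorkerSlots and the enum AccessMode are hypothetical; the sketch also assumes that an exclusive reservation is refused while any shared reservation is outstanding, which the description does not state explicitly.

```java
enum AccessMode { SHARED, EXCLUSIVE }

// Tracks the reservations currently held on one worker.
class WorkerSlots {
    private final int maxClients;
    private int sharedCount = 0;
    private boolean exclusiveHeld = false;

    WorkerSlots(int maxClients) {
        this.maxClients = maxClients;
    }

    // Returns true and records the reservation if it can be granted.
    synchronized boolean tryReserve(AccessMode mode) {
        if (exclusiveHeld) {
            return false;                      // exclusive holder blocks everyone
        }
        if (mode == AccessMode.EXCLUSIVE) {
            if (sharedCount > 0) {
                return false;                  // assumption: no exclusive over shared
            }
            exclusiveHeld = true;
            return true;
        }
        if (sharedCount >= maxClients) {
            return false;                      // shared slots exhausted
        }
        sharedCount++;
        return true;
    }

    synchronized void release(AccessMode mode) {
        if (mode == AccessMode.EXCLUSIVE) {
            exclusiveHeld = false;
        } else if (sharedCount > 0) {
            sharedCount--;
        }
    }
}
```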

[0097] Referring to FIG. 9, the wait mode specifies the appropriate action if a suitable worker is not available. For example, the client can elect to wait in a queue indefinitely until a suitable worker is available (e.g., indefinite_wait), the client can elect to wait for a limited time period before re-obtaining control (e.g., timed_wait), or the client may simply choose not to wait at all (e.g., no_wait).
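
One possible way to model the three wait modes, offered only as an assumption-laden sketch, is to map them onto the standard java.util.concurrent.Semaphore operations. The class WorkerGate and the enum WaitMode are hypothetical names; a fair semaphore is used so that waiting requests are served in arrival order.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

enum WaitMode { INDEFINITE_WAIT, TIMED_WAIT, NO_WAIT }

// Each permit stands for one free worker slot in the pool.
class WorkerGate {
    private final Semaphore free;

    WorkerGate(int freeSlots) {
        this.free = new Semaphore(freeSlots, true);   // true = FIFO fairness
    }

    // Returns true if a slot was obtained under the requested wait mode.
    boolean acquire(WaitMode mode, long timeoutMillis) {
        try {
            switch (mode) {
                case INDEFINITE_WAIT:
                    free.acquire();                   // block until a slot frees up
                    return true;
                case TIMED_WAIT:
                    return free.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS);
                default:
                    return free.tryAcquire();         // NO_WAIT: fail immediately
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();       // preserve interrupt status
            return false;
        }
    }

    void release() {
        free.release();
    }
}
```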

[0098] Referring to FIG. 9, the reservation time specifies the duration of the reservation (in msec). The client obtains a reservation for a worker. The reservation is guaranteed for the reservation time (i.e., the reservation cannot be revoked during this time period). However, choosing a long time period has its consequences. For example, if the client were to disappear or exit due to an error, the worker is locked for this time period. This is undesirable in situations in which the client is transient (e.g., JAVA applets). A more stable client (e.g., a transaction manager) may choose to use a longer reservation time. On the other hand, choosing too short of a reservation time may cause frequent revocations and interruptions in the work being performed.

[0099] FIG. 10 provides a reservation context 168 in accordance with some embodiments of the present invention. In some embodiments, the client 80 of FIG. 3 passes a reservation context 168 on each call to the service object 90 of FIG. 3. The reservation context 168 contains information about the client, the service, and the last worker that was reserved. The service object 90 of FIG. 3 uses the reservation context as a hint when allocating a worker. Once a worker is allocated, the service object 90 of FIG. 3 modifies the worker key with the reserved worker ID. The worker checks the reservation context 168 to make sure that the client key is present and keeps track of clients that have reserved the worker. The worker also verifies if the client has a valid reservation (e.g., the reservation has not been revoked).

[0100] In particular, the reservation context 168 as shown in FIG. 10 includes a client key, a service key, and a worker key. The client key identifies the client, the service key identifies the service object that returned the previous reservation, and the worker key identifies the previously reserved worker.

[0101] For example, if the client had previously reserved a worker, but the reservation has expired and the client would like to, if possible, return to the same worker, then the client simply passes the worker key as a hint to the service object's reservation mechanism. The service object will try to allocate the hinted worker if it is available. If not, the service object may allocate the next available worker in the worker pool.

[0102] The client can modify the reservation properties 166 using the setReservationProperties operation of the service object's reservation interface. For example, the setReservationProperties operation can be used to extend an existing reservation before the reservation expires or to change the mode of reservation from exclusive to shared once the critical updates in a database have been completed.

[0103] FIG. 11 provides service properties 180 in accordance with some embodiments of the present invention. In particular, service.instanceName provides the instance name of the service. Service.label provides the user visible name of the service. Service.description provides a description of the service. Service.serviceID uniquely identifies the service. Thus, multiple services with the same instance name and properties will have a different value for Service.serviceID. Service.processLocation provides whether the service is in process or out of process (e.g., indicating whether or not the service will be launched in its own virtual machine (VM)). For example, a C++ object is preferably launched outside a JAVA VM. Service.type provides the JAVA class name of the service. Thus, Service.type indicates the class to instantiate to bring up the service. Service.numWorkers provides the number of workers maintained by the service. Service.numHighPriWorkers provides the number of high priority workers maintained by the service. Service.maxWorkerRestarts provides the maximum number of times a service will attempt to restart a failed worker. Service.launchSequence provides that the lower the number the earlier the SM will launch the service (e.g., the results may be ambiguous if there are two services with the same launch sequence). Service.inactiveManagerInterval provides the minimum number of milliseconds before a service considers its parent SM dead and terminates itself.

[0104] FIG. 12 provides a reservation interface of the service object 90 of FIG. 3 in accordance with some embodiments of the present invention. The reservation interface shown in FIG. 12 is written in standard Interface Definition Language (IDL).

[0105] FIG. 13 is a flow diagram illustrating a reservation revocation operation in accordance with some embodiments of the present invention. For example, the service object 90 of FIG. 3 may include a reservation mechanism that performs the reservation revocation operation (i.e., asynchronous reservation revocation). In particular, if a worker reservation has expired, the client whose reservation expired is not immediately denied access to the worker. Reference numeral 200 refers to a first stage in this embodiment. In stage 200, the reservation mechanism determines whether or not a client's reservation on a worker has expired. If the client's reservation on the worker has not expired, then as shown in stage 202 the client may continue to use the worker. However, if the client's reservation on the worker has expired, then as shown in stage 204 the reservation mechanism determines whether or not other workers are available. In particular, this allows the client to continue to use the worker until no workers are available for a new client. In stage 206, the reservation mechanism revokes the client's reservation on the worker assuming that, at this time, there are no workers available for new clients, and a new client is requesting a worker (i.e., all the workers in the pool have been reserved by maxClients clients or clients in exclusive mode). Thus, in stage 206, the reservation mechanism revokes a reservation of a client whose reservation period has expired.

[0106] In some embodiments, a reservation is revoked only if there are no available workers (i.e., every worker has the maximum number of clients reserved), and there is at least one expired reservation. If there are no available workers, then the reservation mechanism will revoke the oldest-expired reservation on the least-loaded worker.
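
The selection rule described above can be sketched as follows, under two stated assumptions: the load of a worker is approximated by its number of outstanding reservations, and only workers holding at least one expired reservation are considered. The classes Reservation, PooledWorker, and RevocationPolicy are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class Reservation {
    final String clientId;
    final long expiresAtMillis;

    Reservation(String clientId, long expiresAtMillis) {
        this.clientId = clientId;
        this.expiresAtMillis = expiresAtMillis;
    }
}

class PooledWorker {
    final String id;
    final int maxClients;
    final List<Reservation> reservations = new ArrayList<>();

    PooledWorker(String id, int maxClients) {
        this.id = id;
        this.maxClients = maxClients;
    }

    boolean isFull() {
        return reservations.size() >= maxClients;
    }
}

final class RevocationPolicy {
    private RevocationPolicy() {}

    // Returns the reservation to revoke, or empty if nothing may be revoked.
    static Optional<Reservation> pickVictim(List<PooledWorker> pool, long nowMillis) {
        // Nothing is revoked while any worker still has a free slot.
        if (pool.stream().anyMatch(w -> !w.isFull())) {
            return Optional.empty();
        }
        return pool.stream()
                // keep only workers that hold an expired reservation
                .filter(w -> w.reservations.stream()
                        .anyMatch(r -> r.expiresAtMillis <= nowMillis))
                // least-loaded worker (assumption: fewest reservations)
                .min(Comparator.comparingInt(w -> w.reservations.size()))
                // oldest-expired reservation on that worker
                .flatMap(w -> w.reservations.stream()
                        .filter(r -> r.expiresAtMillis <= nowMillis)
                        .min(Comparator.comparingLong(r -> r.expiresAtMillis)));
    }
}
```

An unexpired reservation is never returned by pickVictim, which reflects the guarantee that a reservation cannot be revoked during its reservation time.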

[0107] In particular, reservations can be revoked only if the period specified in the reservation properties has expired. Reservations cannot be revoked if the reservation time of FIG. 9 has not expired. Thus, the client can continue to use the worker until the reservation is revoked. Once a reservation has been revoked, the worker now has room for at least one more client. This available slot is given to a new client that requested a worker. Such a revocation causes a call back notification to be sent to the client's service proxy. As discussed above, the client's service proxy for the service encapsulates the entire worker reservation logic. The service proxy uses the notification to invalidate any cached worker handles. Accordingly, any further request on the service proxy from the client will cause the service proxy to obtain a new reservation for a worker before proceeding with the request.

[0108] FIG. 14 is a flow diagram illustrating the operation of reserving a previously reserved worker in accordance with some embodiments of the present invention. Reference numeral 220 refers to a first stage in this embodiment. In stage 220, the client passes the worker ID of the previously reserved worker (e.g., in the worker key of the reservation context 168 of FIG. 10) as a worker hint to the reservation mechanism (e.g., of the service object 90 of FIG. 3). In stage 222, the service object will attempt to allocate the previously reserved worker if it is available. In stage 224, if the previously reserved worker is available, then the service object allocates the previously reserved worker as provided in the hint. However, if not, then the service object simply allocates the next available worker, in stage 226.

[0109] As discussed above with respect to FIG. 10, the reservation context 168 of FIG. 10 contains information about the client, the service, and the previously reserved worker. The service object uses the reservation context as a hint when allocating a worker. Thus, in some embodiments, once a worker is allocated, the service object fills in the reservation context with the reserved worker ID, as shown in stage 228 of FIG. 14. The worker also may check the reservation context to make sure that the client ID is present and to keep track of clients that have reserved the worker. The worker may also verify if the client has a valid reservation (e.g., that the reservation has not been revoked).
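
Stages 220 through 228 of FIG. 14 can be illustrated with the following sketch. The classes ReservationContext and HintedAllocator are hypothetical; the sketch assumes one client per worker and treats "next available" as the first free worker in registration order, neither of which is mandated by the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Mirrors the three keys of the reservation context of FIG. 10.
class ReservationContext {
    final String clientKey;
    final String serviceKey;
    String workerKey;                     // previously reserved worker, or null

    ReservationContext(String clientKey, String serviceKey) {
        this.clientKey = clientKey;
        this.serviceKey = serviceKey;
    }
}

class HintedAllocator {
    // worker ID -> whether the worker is currently free (insertion-ordered)
    private final Map<String, Boolean> free = new LinkedHashMap<>();

    void addWorker(String workerId) {
        free.put(workerId, true);
    }

    // Returns the allocated worker ID, or null if no worker is free.
    String allocate(ReservationContext ctx) {
        String chosen = null;
        if (ctx.workerKey != null && Boolean.TRUE.equals(free.get(ctx.workerKey))) {
            chosen = ctx.workerKey;                       // stage 224: honor the hint
        } else {
            for (Map.Entry<String, Boolean> e : free.entrySet()) {
                if (e.getValue()) {                       // stage 226: next available
                    chosen = e.getKey();
                    break;
                }
            }
        }
        if (chosen != null) {
            free.put(chosen, false);
            ctx.workerKey = chosen;                       // stage 228: fill in context
        }
        return chosen;
    }

    void release(String workerId) {
        free.replace(workerId, true);                     // ignores unknown worker IDs
    }
}
```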

[0110] FIG. 15 provides a reservation revocation call back interface in accordance with some embodiments of the present invention. The reservation revocation call back interface as shown in FIG. 15 is written in IDL.

[0111] In some embodiments, a call back notification mechanism increases scalability. For example, a client can continue to use an expired reservation until the concurrent load on the system increases to a level at which reservation revocations begin to occur. Thus, a client may not release a worker even after obtaining a new reservation, and still not cause new clients problems when trying to access a limited resource. Even if the call back notification fails to reach the client, the worker will still be notified of the revocation. If the service proxy issues a request to the worker using an expired reservation, the worker raises an exception indicating this problem. In particular, as discussed above with respect to FIG. 14, the worker checks the reservation context of each client that attempts to use the worker. Thus, the service proxy, upon receipt of the exception raised by the worker, will obtain a new reservation and retry the request. Of course, once the reservation of the client expires, the service object is free to revoke the client's reservation and offer the reservation to some other client. When such an event occurs, a notification is sent to the client's service proxy that holds the reservation. The client's service proxy can then invalidate the worker reference immediately. Any new operations performed on the service proxy will force it to allocate a new worker before dispatching the operation.

[0112] Accordingly, a service proxy obtains a reservation on a worker object and may not release the worker until there is a lack of free workers and there are competing clients for the workers. In other words, if a service proxy has reserved a worker and there is no contention for the worker, the service proxy may never release the worker. In such a case, the service proxy does not issue releaseWorker and allocateWorker requests to the service object, and the service proxy simply continues to use the worker object until no longer needed. If there is a worker contention and a reservation has to be revoked, then the reservation revocation callback interface is used and a callback is issued to the service proxy that is holding the worker (e.g., the reservation time of FIG. 9 has expired).

[0113] FIG. 16 provides worker properties 230 in accordance with some embodiments of the present invention. As discussed above, the workers provide an encapsulation of a limited resource. Like the service object, the worker interface derives from the administrative (admin) layer. For example, the service object 90 of FIG. 3 uses the worker's admin layer to activate and deactivate the workers 92, 94, and 96. The worker interface implements operations that allow the service to notify the worker about its reservations. The worker uses the reservation information (e.g., reservation properties 166 of FIG. 9 and reservation context 168 of FIG. 10) to disallow unexpected or expired clients from accessing the worker. The worker also keeps track of clients that have reserved the worker and ensures that the number of clients does not exceed maxClients. The service instantiates the worker either in process or out of process as discussed further below with respect to FIG. 19. Once the service object instantiates the workers into the worker registry, the service object calls setProperties to pass on the worker properties. Once the worker receives the properties and initializes the internal data structures such as the worker's client list, the worker is ready to receive client requests.

[0114] In particular, the worker properties 230 are provided in FIG. 16. For example, worker.numProcesses provides the number of processes to start up to support the number of workers supported when the workers are out of process. In some embodiments, the numWorkers divided by the numProcesses equals the numWorkersPerProcess. Worker.processLocation is either in process or out of process indicating whether the workers will run in the same VM as the service. Worker.type indicates, for example, the JAVA class that will be instantiated for JAVA workers (either in process or out of process). Worker.javaVM specifies the JAVA VM used to launch worker.type for out of process JAVA workers. Worker.javaDebugVM is the default command to run the JAVA VM in debug mode (e.g., JAVA_g -debug). Worker.exe indicates the command line to launch (C++) out-of-process workers.

[0115] Worker.portNumber is the port number on which out of process workers receive requests. A port number represents a numbered network connection. For example, a telephone number is a port number in the telecommunication network. The Internet is based on the TCP/IP network protocol. Thus, in the Internet context, the port number is local to a server (i.e., unique within the server), and the port number is a unique integer that represents the IP address of the network connection to the server. For example, an allocated worker listens for an incoming request on the IP address of the network connection, and the incoming request may be an operation that is exported by the worker in the IDL interface of the worker.

[0116] Worker.maxClients is the maximum number of simultaneous clients that are allowed access to a single worker. If the maxClients value is greater than 1, then the worker is thread safe (i.e., multi-threaded).

[0117] Worker.inactiveServiceInterval is the number of milliseconds before the worker considers its service dead and exits. In particular, each worker is periodically pinged by the service object, and the pinging interval depends on the service's pingInterval property (i.e., the Service.pingInterval period is less than the Worker.inactiveServiceInterval period). The pinging of workers may be used to obtain runtime workload statistics from the workers for workload balancing purposes and also to obtain the current state of the workers. For example, if a worker has failed for some reason, the state indication will help the service object restart the worker and bring the worker back online, which represents part of the fault tolerance aspects of the service framework of the present invention. Thus, if the worker is not pinged by the service object within the Worker.inactiveServiceInterval property, then the worker assumes that the service object is no longer online, and the worker terminates itself. The worker terminates itself, because the service object may have failed without terminating its workers. If the service object failed, then the SM that monitors the failed service will restart the failed service, and the restart of the failed service will cause the original worker pool to be stranded, and thus, the worker terminates itself to avoid being stranded.

[0118] ServiceType.XXX for each service is expected to have its own properties (prefixed by the service type), and there can be an arbitrary number of these properties. In particular, the ServiceType.XXX property is a property of a service identified by ServiceType. For example, the “RDBMS Data Service” has a service type DataService and has a property pingInterval that defines how frequently the DataService will ping its workers. Thus, this property has a name (e.g., DataService.pingInterval), and the property has a value (e.g., 200 seconds). In some embodiments, the properties are implemented in JAVA and stored in a properties file, and the service manager is responsible for reading and writing the stored properties.
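By way of illustration only, the prefix convention described above may be sketched as follows. The sketch is written in Python rather than JAVA for brevity, and the file contents, the SessionService entry, the numWorkers value, and the function names parse_properties and properties_for are hypothetical and are not taken from the figures.

```python
def parse_properties(text):
    """Parse simple 'key=value' lines, skipping blank lines and '#' comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def properties_for(props, service_type):
    """Return the properties whose names are prefixed by '<service_type>.'."""
    prefix = service_type + "."
    return {key[len(prefix):]: value
            for key, value in props.items()
            if key.startswith(prefix)}


# Hypothetical property file contents (values are illustrative assumptions).
SAMPLE = """
# properties are prefixed by the service type
DataService.pingInterval=200
DataService.numWorkers=4
SessionService.pingInterval=500
"""

data_service = properties_for(parse_properties(SAMPLE), "DataService")
print(data_service)
```

In this sketch the values remain strings, as they would in a plain properties file; converting them to numbers is left to the consumer of each property.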

[0119] FIG. 17 is a flow diagram illustrating the operation of pinging workers in accordance with some embodiments of the present invention. Reference numeral 240 refers to a first stage in this embodiment. In stage 240, each worker of a service is pinged by the service periodically. In stage 242, it is determined whether or not the worker answers the ping within a predetermined time period. If the worker does not answer the ping within the predetermined time period, the service object considers the worker to be dead and re-instantiates the worker, as shown in stage 244. Each worker also includes a service object ping interval timer. In stage 246, a worker determines whether the service object has pinged the worker within a particular time interval. If not, then the worker considers the service object to be dead and the worker terminates itself as shown in stage 248. Otherwise, the pinging operation repeats as shown in FIG. 17. Accordingly, the pinging operation provides fault tolerance in the service framework of the present invention.
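The two-sided liveness check of stages 240 through 248 may be sketched, under simplifying assumptions, as follows. The sketch uses explicit millisecond timestamps instead of real timers, models an unanswered ping as a boolean flag rather than a timeout, and the class and method names (Worker, Service, ping_all, check_service) are hypothetical.

```python
class Worker:
    def __init__(self, name, inactive_service_interval_ms, now_ms=0):
        self.name = name
        self.inactive_service_interval_ms = inactive_service_interval_ms
        self.last_ping_ms = now_ms   # time of the last ping from the service
        self.alive = True
        self.responsive = True       # set to False to simulate a hung worker

    def ping(self, now_ms):
        """Called by the service; returns True if the worker answers."""
        if not (self.alive and self.responsive):
            return False
        self.last_ping_ms = now_ms
        return True

    def check_service(self, now_ms):
        """Stages 246/248: terminate if the service has been silent too long."""
        if now_ms - self.last_ping_ms > self.inactive_service_interval_ms:
            self.alive = False
        return self.alive


class Service:
    def __init__(self, worker_names, inactive_service_interval_ms):
        self.interval_ms = inactive_service_interval_ms
        self.workers = {n: Worker(n, inactive_service_interval_ms)
                        for n in worker_names}
        self.restarts = []

    def ping_all(self, now_ms):
        """Stages 240-244: ping each worker and re-instantiate silent ones."""
        for name, worker in list(self.workers.items()):
            if not worker.ping(now_ms):
                self.workers[name] = Worker(name, self.interval_ms, now_ms)
                self.restarts.append(name)


service = Service(["w1", "w2"], inactive_service_interval_ms=1000)
service.workers["w2"].responsive = False
service.ping_all(now_ms=200)
print(service.restarts)
```

In this sketch a worker that stops hearing from the service marks itself as no longer alive, which corresponds to the self-termination that prevents a stranded worker pool.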

[0120] FIG. 18 shows a server 280 in accordance with another embodiment of the present invention. In particular, a server 280 includes an admin interface 282, a SM 284, a start/stop functionality 286, and a configuration 288. Because the service framework is based on a distributed object model, and there can be many objects in the service framework that interact with each other, it is preferred to group objects together into a higher level entity so that the distributed objects can be more effectively managed. Accordingly, the configuration 288 includes a collection of services and service locators along with their properties. The configuration 288 is maintained by the SM 284. In particular, when the SM 284 comes up, it comes up with a specified configuration of services and service locators, such as a service locator 290.

[0121] For example, a configuration may contain the following services: an RDBMS service with two clones, a session/state management service with one worker, and a service locator with three clones. The configuration is then given a name, and the property files for each service instance in the configuration are stored within the configuration.

[0122] Referring to FIG. 18, the configuration 288 includes a cell. A cell represents a distributed configuration. Two hosts with the same configuration name but different contents can link up together to form a single cell. The cell is designated by a list of hosts that the configuration spans. Once the configurations are linked into a cell, the set of service locators and services is common to the entire cell (i.e., each service locator contains a list of all services in all hosts in the cell). Thus, clients accessing any service locator in the cell can access any service in the cell. The cell is maintained in a consistent manner by the SM managing the individual configurations. SM operations may include many different functions such as managing the service and service locator instances, providing fault tolerance by pinging as discussed above with respect to FIG. 17, and exporting a management interface to administer the services.

[0123] Referring to FIG. 18, the management interface can be used by administrative tools to change the SM's configuration (e.g., add a new service, change service locator clones, etc.). Thus, the management interface can be used to modify the operations of any of the entities in the configuration (e.g., change numWorkers for an RDBMS service). Finally, the management interface offers mechanisms 286 to start/stop each instance of a service or service locator and obtain the current state of each of the instances in the configuration.

[0124] Further, the service locators are linked to the configurations. Thus, a service locator in a configuration on a particular server actually knows about services on another system, the distributed configuration or the cell. Thus, if a request for a service is made to the service locator 290, then a client can actually get access to a service provided by the server 280, but may also be provided access to a service instance in the cell which is residing on a server that may be anywhere on the network (e.g., the global Internet). Moreover, this entire process is transparent to the client, because the client's service proxy is simply returned a service handle.

[0125] However, the SM 284 is only responsible for services in its own configuration 288. Thus, the SM 284 is responsible to supply the information regarding its configuration 288 to all service locators in the cell. The service locators in the cell are well known (i.e., they are actually embedded in the cell's description itself). Thus, given a cell, a connection can be made to service locators in the cell. Thus, the SM 284 knows the list of services, loads this information and the properties of the service, and simply forwards such information periodically to the service locator 290. Thus, the service locator 290 can assume that the information provided from the SM 284 represents currently available services.

[0126] FIG. 19 shows an out-of-process factory 300 and an in-process factory 308 in accordance with some embodiments of the present invention. The number of workers instantiated by the service is limited by the service property service.numWorkers. All the workers may be of the same type or of different types (e.g., high priority and low priority workers). The workers are instantiated when the service is initialized by the SM (i.e., the properties of the service and the workers are passed to the service object). Workers can be instantiated in a separate process or within the same process as the service object as shown by out-of-process factory 300 and in-process factory 308, respectively. Workers are instantiated by using worker object factories, as shown by an object factory 302 and an object factory 312.

[0127] The object factory is an object that can instantiate or fabricate any number of objects of a given type. For example, in an object-oriented language like JAVA or C++, to instantiate an object is to create a new object using the “new” syntax operator, and in CORBA, a CORBA object is instantiated using the “obj_is_ready” operation defined in CORBA. A worker object factory such as the object factory 302 and the object factory 312 can instantiate any number of worker objects. In particular, the object factory 312 is a CORBA object that can be in the same process (in process). Thus, the service object (e.g., the service object 90 of FIG. 3) can then use the object factory 312 to create type 1 workers 318 (e.g., low priority workers) and type 2 workers 320 (e.g., high priority workers). In some embodiments, the object factory 312 is a CORBA object implemented in JAVA.

[0128] Referring to FIG. 19, the object factory can also be in a separate process (out of process), in which case, the service object spawns (forks) a separate process, and the process then instantiates an object factory and passes a handle to the object factory back to the creator. In particular, the spawned process creates the object factory, registers the object 302 (i.e., the object factory) with the ORB, and passes a reference to the object back to the parent process using the standard I/O string. The object and the parent process (e.g., the service object 90 of FIG. 3) that spawned the object factory process use the object reference of the object factory to instantiate the necessary objects in that process (e.g., the worker pool). Thus, the service object can then use the object factory 302 to create type 1 workers 314 and type 2 workers 316. In some embodiments, the object factory 302 is a CORBA object implemented in JAVA.

[0129] Thus, the object factory is a flexible mechanism to create and manage pools of similar objects (i.e., objects that have the same type and the same set of properties) such as worker pools. In some embodiments, object factories are used to create the service objects (in the SM), the worker pools (in the service), and other objects such as the service locator.
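As a non-limiting sketch, an in-process object factory that fabricates a pool of similar worker objects might look like the following. The sketch omits CORBA and process spawning entirely, and the names ObjectFactory, Worker, build_worker_pool, and the worker naming scheme are hypothetical.

```python
class Worker:
    """Stand-in for a worker object; a real worker would wrap a resource."""

    def __init__(self, name, priority):
        self.name = name
        self.priority = priority


class ObjectFactory:
    """Fabricates any number of objects of one given type."""

    def __init__(self, object_type):
        self.object_type = object_type
        self.instances = []          # objects live alongside the factory

    def create(self, **properties):
        instance = self.object_type(**properties)
        self.instances.append(instance)
        return instance


def build_worker_pool(factory, num_workers, priority):
    """Create num_workers similar workers (same type, same properties)."""
    return [factory.create(name=f"worker-{i}", priority=priority)
            for i in range(num_workers)]


factory = ObjectFactory(Worker)
pool = build_worker_pool(factory, num_workers=3, priority="low")
print([w.name for w in pool])
```

Because every object is created by, and recorded in, the factory, placing the factory in a separate process would place the whole pool in that process, which is the property the out-of-process factory relies on.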

[0130] In some embodiments, the object factory is a remote CORBA object that is implemented in JAVA and C++. Thus, a JAVA-based factory can instantiate any CORBA object that is implemented in JAVA and instantiates the object in the same address space as the factory. Accordingly, if the factory object is in a separate address space, any objects created in the factory are in the separate address space of the factory object (i.e., in the separate process). Because the JAVA-based factory object can instantiate any JAVA-based CORBA object, the same JAVA-based factory may be used to instantiate a service locator (e.g., a JAVA-implemented CORBA object) or a service object (e.g., a JAVA-implemented CORBA object) and, thus, reside in the same address space.

[0131] In contrast, in some embodiments, an object factory implemented in C++ is more limited in function than a JAVA-based object factory, because the object factory implemented in C++ can only instantiate similar objects (e.g., due to limitations of the C++ language). For example, the DataService workers are implemented in C++, and the DataService worker objects are instantiated in a separate process using the object factory implemented in C++. Thus, the object factory implemented in C++ can only instantiate DataService worker objects.

[0132] In another embodiment, the number of workers can fluctuate dynamically using an object factory. Thus, new workers can be added or the number of workers can be reduced, based on parameters such as the workload on the existing number of workers. In particular, the configuration (e.g., the configuration 288 of FIG. 18) may provide a minimum or maximum range of the number of workers for a service, and the number of workers can be implemented to fluctuate dynamically within the configured range depending on various parameters such as the workload on the present number of workers.

[0133] FIG. 20 shows clone factories in accordance with some embodiments of the present invention. An object factory can instantiate basically any CORBA object. The object factory instantiates a CORBA object using CORBA (and JAVA) introspection to determine the type of object to be created and its parameters (e.g., object name). The SM uses this functionality to create clone factories. In particular, a clone is another instance of an entity that behaves exactly like the original entity (i.e., the clone has the same properties or attributes as the original entity) but resides in a different address space. For example, a service locator clone is another instance of the service locator object in the same machine, and the service manager may instantiate a service locator clone in the same machine for fault tolerance purposes (e.g., if one service locator instance fails, the other service locator instance is still available).

[0134] Referring to FIG. 20, a SM 320 has created clone factories 322 and 324. The SM 320 creates factories for each service and service locator. Each clone of a service (or service locator) is located in a different clone factory. For example, if there are multiple clones for a service, the first may be located in the clone factory 322, the second in the clone factory 324, etc.

[0135] For example, two service locator clones and two service clones (e.g., of the DataService) may be provided. Each clone has a clone ID. Thus, one service locator clone instance may have a clone ID 0, and the other service locator clone instance may have a clone ID 1. Similarly, one DataService instance may have a clone ID 0, and the other DataService instance may have a clone ID 1. The SM 320 instantiates these clones. In particular, the SM 320 instantiates two object factories, one for clone 0 instances and the other for clone 1 instances. The two clone factories may be in different address spaces (processes), which provides fault tolerance in the service framework of the present invention (e.g., ensuring that no two clones are in the same address space provides fault tolerance). The clone factory 0 instantiates the service locator clone 0 and the DataService clone 0. The clone factory 1 instantiates the service locator clone 1 and the DataService clone 1. In some embodiments, a service can have any number of clones, and the SM 320 will instantiate the appropriate number of clone factories.
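The assignment of clones to clone factories by clone ID may be illustrated with the following sketch. The function name assign_clones and the dictionary representation are hypothetical; a clone factory is modeled simply as the list of clones it would instantiate.

```python
def assign_clones(clone_counts):
    """Map each clone ID to the clones its clone factory instantiates.

    clone_counts maps an entity name (service or service locator) to its
    number of clones. Clone i of every entity is placed in clone factory i,
    so no two clones of the same entity share a factory (address space).
    """
    factories = {}
    for entity, count in clone_counts.items():
        for clone_id in range(count):
            factories.setdefault(clone_id, []).append((entity, clone_id))
    return factories


layout = assign_clones({"ServiceLocator": 2, "DataService": 2})
for clone_id, clones in sorted(layout.items()):
    print(clone_id, clones)
```

The number of factories produced equals the largest clone count among the entities, which corresponds to the SM instantiating the appropriate number of clone factories.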

[0136] Generally, a process represents an operating system term that may also be used to refer to an address space. In particular, operating systems such as HP-UX™ or Windows NT™ use the term process to refer to a region of computer memory that is separated from the rest of the computer's memory (e.g., allocated memory). Thus, a region of memory is allocated to a program that is being executed. The program is launched by executing an executable file (e.g., .exe in Windows NT). Once the program terminates, the region of memory is deallocated and returned to the computer's memory pool. A program needs a region of memory to maintain data that it has read from the terminal, file system, or from the network. The program manipulates the data and performs its work all within the allocated region of memory. Accordingly, a process is a program that is executing within a region of memory allocated by the operating system. Thus, an out-of-process worker represents a worker object that is instantiated in a process that is separate from the process where the service object resides. An out-of-process service object represents a service object that resides in a process that is separate from the process where the SM object resides. Clone factories represent a set of factory objects with each factory residing in a separate process.

[0137] Accordingly, clone factories such as the clone factories 322 and 324 of FIG. 20 provide additional fault tolerance for the service framework of the present invention. In particular, clone factories are located in different processes so that each clone provides additional fault tolerance and high availability. Once the service object (e.g., the service object 90 of FIG. 3) instantiates a worker pool, the service object pings the workers to make sure that the worker pool is alive and well, as discussed above with respect to FIG. 17. If a worker in the pool does not respond to the pings, the service object re-instantiates the failed worker using the appropriate factory. Also, it should be apparent that two clones may be implemented on the same machine. Such an implementation would be useful for providing fault tolerance. In particular, a clone provides basically an identical worker pool, thus providing the same or nearly identical work as the original worker pool. As a result, if a process or VM is lost, the state that the clients have set up is not completely lost. In particular, in such an event, because the clients are distributed among the clones on the same machine, a loss of a process or VM on a particular server will affect some of the clients but not all of the clients attached to the server. Hence, if a particular address space fails abnormally, then clones may advantageously provide for fault tolerance and high availability, because clones in different address spaces may not have been affected. Accordingly, the service framework of the present invention provides a fault tolerance architecture.

[0138] FIG. 21 provides an object factory interface in accordance with some embodiments of the present invention. The object factory interface of FIG. 21 is written in IDL.

[0139] FIG. 22 provides service locator properties 330 in accordance with some embodiments of the present invention. As discussed above with respect to FIG. 20, service locators may be cloned, but the service locator clones may not communicate with each other. A service locator proxy keeps track of the various service locators in the network and can select the service locator with a minimum workload on which to perform its service lookups. In some embodiments, the service locator proxy is used by almost all the modules in the framework including the service proxy to encapsulate access to the service locator. Thus, the service locator proxy provides another level of workload balancing management in the service framework of the present invention.

[0140] The service locator properties 330 are shown in FIG. 22. In particular, sl.javaVM is the VM command line used to launch the out-of-process service locators, sl.javaDebugVM is the default command to run the JAVA VM in debug mode, sl.inactiveManagerInterval is the minimum number of milliseconds that can elapse before the service locator considers the VM out of service and unregisters its services, sl.locatorId is the unique identifier of the service locator (e.g., the service locator instance's object name), and sl.owningManagerId is the identifier of the owning manager (e.g., the SM 86 of FIG. 3 assuming that the SM 86 launched the service locator).

[0141] FIG. 23 provides a service locator interface in accordance with some embodiments of the present invention. In particular, FIG. 23 provides a service locator interface written in IDL.

[0142] FIG. 24 provides a load balancing manager (LBM) interface in accordance with some embodiments of the present invention. The LBM interface may be written in IDL as provided in FIG. 24. The LBM is an entity of the service object. In particular, the service object instantiates an LBM to manage the pool of workers. The LBM may be a plug-in object that can be customized or entirely replaced.

[0143] The LBM provides a level of workload balancing in the service framework of the present invention. In particular, the service object may use the LBM to perform workload balancing among its workers in the worker pool. In some embodiments, the service framework provides two managers, a fully capable LBM and a null LBM. The fully capable LBM supports access modes (e.g., exclusive mode) and also provides a sophisticated scheduling scheme based on worker statistics. In contrast, the null LBM does not support access modes (i.e., the null LBM treats them all as the same and randomly selects a worker from the pool of workers). If a service encapsulates a limited resource, the fully capable LBM would be preferred. However, if the service encapsulates an abundant resource (e.g., provides more than one worker 92 per service), or if there is the constraint that there may be only one worker per service, then the null LBM would be sufficient and would improve performance.

[0144] In addition, the LBM acts as a repository of worker objects. The service object creates the workers in the worker pool and registers each worker with the LBM. In particular, each worker in the worker pool, as soon as it is launched, is registered with the LBM using the registerWorker method. Even though the LBM is an entity of the service object, the worker is registered with the LBM, because the LBM may be a plug-in object, so it may be independent of the service object (i.e., does not share internal data with the service object). Once the worker is terminated, the service object unregisters the worker from the LBM's registry using the unregisterWorker method, and the service object may terminate the worker pool.
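For illustration, a null LBM of the kind described above may be sketched as a small registry with random selection. The method names mirror registerWorker, unregisterWorker, and allocateWorker in Python naming style, but the signatures, the access_mode parameter, and the injectable random number generator are assumptions and do not reproduce the IDL of FIG. 24.

```python
import random


class NullLBM:
    """Worker registry that ignores access modes and picks a worker at random."""

    def __init__(self, rng=None):
        self._workers = []
        self._rng = rng if rng is not None else random.Random()

    def register_worker(self, worker):
        if worker not in self._workers:
            self._workers.append(worker)

    def unregister_worker(self, worker):
        if worker in self._workers:
            self._workers.remove(worker)

    def allocate_worker(self, access_mode=None):
        """access_mode is accepted but ignored: all modes are treated alike."""
        if not self._workers:
            return None
        return self._rng.choice(self._workers)


lbm = NullLBM(rng=random.Random(0))
lbm.register_worker("worker-a")
lbm.register_worker("worker-b")
print(lbm.allocate_worker(access_mode="exclusive"))
```

Because the registry is held inside the LBM object and not shared with the service object, a different LBM implementation exposing the same three operations could be substituted without changing the caller.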

[0145] FIG. 25 shows a fully capable LBM 360 in accordance with some embodiments of the present invention. The fully capable LBM uses a sophisticated scheme of priority queues and a scheduler, an allocation manager 362, to implement the allocateWorker and releaseWorker methods. The service object (e.g., the service object 90 of FIG. 3) simply forwards these requests to the registered LBM. The fully capable LBM 360 includes five priority queues. In particular, an idle queue 368 contains workers that have no client reservations. In the idle queue 368, workers may be sorted in increasing order of workload.

[0146] In some embodiments, the workload value of a worker is a floating point number computed by the worker that represents the load on the worker. In particular, the workload value may be determined by a calculation based upon such factors as the ratio of time spent in executing a worker method to the elapsed time and the CPU load. The elapsed time represents the time between pings from the service object to the worker (e.g., the value provided by the property Service.pingInterval). The time spent in executing a method is calculated by summing the time spent in any operation in the worker. Thus, the ratio of the time spent in executing a worker method to the elapsed time provides an indication of the fraction of time that was spent by the worker actually performing a worker method. The CPU (central processing unit or processor) load is the time spent by the computer in performing work. Accordingly, the combination of these two factors provides a measure of the workload on the worker and the workload on the CPU.
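One possible way to combine the two factors is sketched below. The description does not specify how the factors are combined, so the weighted average, the busy_weight parameter, the clamping of the ratio to 1.0, and the expression of CPU load as a fraction between 0 and 1 are all assumptions made solely for illustration.

```python
def workload_value(method_time_ms, elapsed_ms, cpu_load, busy_weight=0.5):
    """Return a floating point workload estimate between 0.0 and 1.0.

    method_time_ms: total time spent executing worker methods since the last ping
    elapsed_ms:     time between pings (e.g., the service's ping interval)
    cpu_load:       CPU load expressed as a fraction between 0.0 and 1.0
    busy_weight:    assumed weight of the busy ratio relative to the CPU load
    """
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    # Concurrent requests could make the summed method time exceed the
    # elapsed time, so the ratio is clamped.
    busy_ratio = min(method_time_ms / elapsed_ms, 1.0)
    return busy_weight * busy_ratio + (1.0 - busy_weight) * cpu_load


print(workload_value(method_time_ms=50, elapsed_ms=200, cpu_load=0.75))
```

A worker could report such a value in its reply to each ping, which would allow the worker queues to be sorted in increasing order of workload.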

[0147] Referring to FIG. 25, the usable queue 370 contains workers that have some client reservations, but the number of clients for each worker is less than the maxClients. In the usable queue 370, workers may be sorted in increasing order of a combination of workload and available client reservation slots. An unusable queue 374 contains workers that have client reservations with the number of clients equal to maxClients per worker (i.e., no more reservations are possible against such workers). In some embodiments, the unusable queue 374 is not sorted. A revocable queue 372 contains workers that have one or more reservations that have timed out and can be revoked if necessary. In the revocable queue 372, workers may be sorted based on the time of reservation expiration (i.e., workers containing older revocations are higher in the queue). A client wait queue 366 contains clients that are waiting for a worker to be allocated. In some embodiments, all workers are initially in the idle queue 368, and as reservations are handed out, the workers are moved into the usable queue 370, the unusable queue 374, and the revocable queue 372 as appropriate.
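The placement of a worker in one of the four worker queues may be sketched as a simple classification. The function name classify_worker and its parameters are hypothetical, and the sketch assumes that a worker with free reservation slots stays in the usable queue even if one of its reservations has expired, since only full workers are described as being moved from the unusable queue to the revocable queue.

```python
def classify_worker(num_clients, max_clients, has_expired_reservation):
    """Return the name of the queue a worker belongs in."""
    if num_clients == 0:
        return "idle"
    if num_clients < max_clients:
        return "usable"
    # The worker is full: it is revocable only if a reservation has timed out.
    if has_expired_reservation:
        return "revocable"
    return "unusable"


def sort_idle_queue(workers):
    """Sort (name, workload) pairs in increasing order of workload."""
    return sorted(workers, key=lambda item: item[1])


print(classify_worker(num_clients=2, max_clients=2, has_expired_reservation=True))
print(sort_idle_queue([("w1", 0.7), ("w2", 0.1), ("w3", 0.4)]))
```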

[0148] Referring to FIG. 25, in some embodiments, the five queues are managed by two background threads, a queue manager 364 and an allocation manager 362. The queue manager 364 handles the tasks of maintaining a sorted order on each of the worker queues (except the unusable queue 374 which is not sorted), as described above. The queue manager 364 periodically browses the unusable queue 374 and looks for expired reservations. If a worker in the unusable queue 374 has an expired reservation, the worker is moved to the revocable queue 372, but the expired reservation is not yet revoked.

[0149] Referring to FIG. 25, the allocation manager 362 is the scheduler that manages the client wait queue 366. In some embodiments, the allocation manager 362 maintains the client wait queue 366 in first come first serve (FCFS) order, but maintains the discretion to move clients forward based on their declared duration (i.e., based on the reservation properties such as the access mode, the reservation time, etc., as discussed above with respect to FIG. 9). For example, shorter duration requests are generally moved ahead of longer duration requests. Also, the LBM may move workers into the revocable queue if the workers have clients holding reservations with them that have expired (e.g., based on the reservation timeout). In some embodiments, the allocation manager 362 handles various client wait modes such as no_wait, timed_wait, and indefinite_wait as discussed above with respect to FIG. 9. The allocation manager 362 also scans the worker queues waiting for an available worker.

[0150] In some embodiments, the allocation manager 362 checks for any worker hints supplied in the waiting client's reservation context and first attempts to reserve the hinted worker. If the hinted worker is unusable, then the allocation manager 362 attempts to reserve the next available worker. If a worker is available in the idle queue 368 or the usable queue 370, then the allocation manager 362 immediately allocates the available worker to the waiting client. If no workers are available in the idle queue 368 or the usable queue 370, then the allocation manager 362 scans the revocable queue 372 waiting for new workers to appear. If the first available worker is a worker in the revocable queue 372, then the allocation manager 362 allocates the revocable worker to the waiting client after revoking the expired reservation (e.g., the first worker in the revocable queue 372 is selected for a revocation and the oldest expired reservation is revoked). The revocation involves a notification to the affected worker using the clientRelease call and a callback to the client that holds the expired reservation using the reservationTimedOut call on the ServiceProxy callback interface.
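The order in which the allocation manager considers candidate workers may be sketched as follows. The function name pick_worker and the representation of each queue as an already sorted Python list are assumptions; the revocation itself (the clientRelease and reservationTimedOut calls) is not modeled and is only signalled by the returned queue name.

```python
def pick_worker(idle, usable, revocable, hint=None):
    """Return (worker, source_queue), or (None, None) if the client must wait.

    idle, usable, revocable: lists of worker names, already in queue order.
    hint: optional worker name taken from the client's reservation context.
    """
    # 1. Try the hinted worker first, if it can still accept a reservation.
    if hint is not None:
        if hint in idle:
            return hint, "idle"
        if hint in usable:
            return hint, "usable"
    # 2. Otherwise take the first idle worker, then the first usable worker.
    if idle:
        return idle[0], "idle"
    if usable:
        return usable[0], "usable"
    # 3. As a last resort take the first revocable worker; the caller must
    #    revoke its oldest expired reservation before handing it out.
    if revocable:
        return revocable[0], "revocable"
    return None, None


print(pick_worker(idle=[], usable=["w2"], revocable=["w3"], hint="w9"))
```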

[0151] FIGS. 26A-26B are a flow diagram illustrating the call sequence operation in accordance with some embodiments of the present invention. In stage 400, assuming a client proxy does not already have a reserved worker for a service request, the service proxy obtains a service handle from the service locator. In stage 402, the service proxy issues the allocateWorker call with a set of reservation properties and an uninitialized reservation context. In stage 404, the service object forwards the request to the LBM 360. In stage 406, the LBM determines if any clients are waiting. If clients are waiting, then the LBM enqueues the request in the client wait queue in stage 408. Otherwise, the LBM proceeds to stage 410.

[0152] Referring to FIG. 26A, in stage 410, the LBM attempts to allocate a worker from the idle queue or the usable queue, but if no workers are available, then the LBM attempts to allocate a worker from the revocable queue. In particular, the reservation context is uninitialized, so that there are no worker hints available. Thus, the LBM takes the first entry in the idle queue. If the idle queue is empty, then the LBM checks the usable queue. If the usable queue is empty, then the LBM checks the revocable queue. If no workers are available in the revocable queue, then the LBM checks the unusable queue to see if any workers have expired reservations. If so, the LBM moves a worker with an expired reservation to the revocable queue.

[0153] Once a worker is available in the revocable queue, the LBM selects the first worker in the revocable queue for a revocation and revokes the oldest expired reservation. The revocation involves a notification to the affected worker using the clientRelease call and a callback to the client that holds the expired reservation. Once a worker with a free reservation is available (e.g., in the revocable queue, the usable queue, or the idle queue), a new reservation is created for the client with the appropriate reservation properties. The worker is notified of the new client and its reservation properties, and the reservation context is appropriately modified (e.g., the reservation context may include current information in the client key, the service key, and the worker key). The affected worker moves to the appropriate queues (e.g., from the idle queue to the usable queue or the usable queue to the unusable queue, etc.). Thus, in stage 412, the call to allocateWorker returns with a suitably filled reservation context.

[0154] Referring to FIG. 26B, in stage 414, the service proxy stores the reservation context and then issues the request to the worker with the reservation context. In stage 416, the worker receives the request, verifies the request using the reservation context (e.g., checks whether the client has valid access to the worker, that is, the worker is notified by the service object of the reservation), and then executes the request. Finally, in stage 418, the worker returns the result of the request to the client's service proxy which provides the result to the client.
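The verification performed by the worker in stage 416 may be sketched as follows, using a reservation context that carries a client key, a service key, and a worker key. The class names, the client_reserved notification, and the use of PermissionError are hypothetical; only the clientRelease notification is named in the description, and it is rendered here as client_release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReservationContext:
    client_key: str
    service_key: str
    worker_key: str


class Worker:
    def __init__(self, worker_key, service_key, max_clients):
        self.worker_key = worker_key
        self.service_key = service_key
        self.max_clients = max_clients
        self._clients = set()        # clients the service told us about

    def client_reserved(self, client_key):
        """Hypothetical notification from the service about a new reservation."""
        if len(self._clients) >= self.max_clients:
            raise RuntimeError("maxClients exceeded")
        self._clients.add(client_key)

    def client_release(self, client_key):
        """Notification that a reservation was released or revoked."""
        self._clients.discard(client_key)

    def execute(self, context, request):
        """Verify the reservation context, then run the request."""
        valid = (context.worker_key == self.worker_key
                 and context.service_key == self.service_key
                 and context.client_key in self._clients)
        if not valid:
            raise PermissionError("client holds no valid reservation")
        return request()


worker = Worker("worker-1", "DataService", max_clients=1)
worker.client_reserved("client-A")
ctx = ReservationContext("client-A", "DataService", "worker-1")
print(worker.execute(ctx, lambda: "result"))
```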

[0155] FIG. 27 shows a client wait queue 430 and an idle queue 438 of the LBM 360 of FIG. 25 in accordance with another embodiment of the present invention. In particular, as shown in FIG. 27, the client wait queue 430 may include three internal queues, one for each priority level, high priority 432, medium priority 434, and low priority 436. Similarly, the idle queue 438 may include three internal queues, one for each priority level, high priority 440, medium priority 442, and low priority 444. Further, each queue of the LBM 360 of FIG. 25 may be similarly implemented to include priority sub-queues. Thus, if a client comes with a high priority request, the client's request is entered into the high priority client wait queue 432 of the client wait queue 430. In some embodiments, the allocation manager always services the high priority requests ahead of the lower priority requests. Also, in some embodiments, workers are assigned a priority. Thus, high priority workers only work for high priority clients, and the high priority workers are initially entered in the high priority idle queue 440 of the idle queue 438. When workers are moved from one queue to another, they are moved into the appropriate priority sub-queue. Accordingly, this embodiment provides yet another level of workload balancing in the service framework of the present invention.
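A client wait queue with three priority sub-queues may be sketched as follows. The class name ClientWaitQueue and its methods are hypothetical, and the sketch serves clients in first come first serve order within each priority level, always draining higher priority levels first.

```python
from collections import deque

PRIORITIES = ("high", "medium", "low")   # served in this order


class ClientWaitQueue:
    """Wait queue made of one FCFS sub-queue per priority level."""

    def __init__(self):
        self._queues = {priority: deque() for priority in PRIORITIES}

    def enqueue(self, client, priority):
        self._queues[priority].append(client)

    def dequeue(self):
        """Return the next client to serve, or None if nobody is waiting."""
        for priority in PRIORITIES:
            if self._queues[priority]:
                return self._queues[priority].popleft()
        return None


waiting = ClientWaitQueue()
waiting.enqueue("client-1", "low")
waiting.enqueue("client-2", "high")
waiting.enqueue("client-3", "high")
print(waiting.dequeue(), waiting.dequeue(), waiting.dequeue())
```

The same structure could be reused for the worker queues, with each worker placed in the sub-queue matching its assigned priority.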

[0156] In some embodiments, each reservation session has a specified durationTimeout attribute in the reservation properties. The reservation session represents a particular reservation that a service proxy of a client holds on a worker. The service object records the reservation session and uses the information in the session to control the reservation (e.g., revoke the reservation once the timeout expires). In particular, the reservation may include reservation properties provided by the service proxy on the allocateWorker call, and the reservation context, which is sent back and forth between the proxy and the service on all calls. The reservation context includes the client key (identifies the service proxy uniquely), the service key which identifies the service object itself, and the worker key which identifies the worker uniquely within the service. Once the service object reserves a worker for a client, the service object marks the worker key in the reservation context. Further, the LBM may store the reservation session as soon as the reservation on a worker is given to the client.

[0157] Every service proxy has to supply a valid durationTimeout value when calling allocateWorker. The durationTimeout determines the length of time (in milliseconds) that the service object (e.g., the service object 90) guarantees the worker to stay reserved for this client from the moment the worker has been allocated. If the client has a set of operations to be called that must be executed on the same worker instance, then this set of operations must be completed within the durationTimeout interval for that work to be guaranteed. Once this interval expires, the service object may revoke the reservation of this client and offer the worker to another client that is waiting for a worker to become available.

[0158] Moreover, there are some optimizations in the reservation mechanism to reduce the number of calls made by the service proxy to the service. In some embodiments, once the durationTimeout expires, the client does not immediately lose the worker reservation (i.e., asynchronous reservation revocation). The client loses the reservation only if there exists a contention. If there are other clients waiting for the worker, the client that has an expired durationTimeout may lose the worker. If there is no contention, the client's reservation is usually still valid, and the client can keep using the worker. If the service object decides to revoke the reservation, then the service object issues the reservationTimedOut call to notify the service proxy. The first subsequent operation on the service proxy will try to reserve a new worker before executing the requested operation.

[0159] In addition to the durationTimeout, in some embodiments, there is another time-out attribute in the reservation properties, an inactiveTimeout. The inactiveTimeout detects idle clients and revokes their reservation. If a service proxy that has reserved a worker has not used the worker within the inactiveTimeout interval, then the worker notifies the service object, and the service object may revoke the worker and assign it to another client that is waiting for the worker. If the worker is revoked, then the service proxy is notified of the revocation using the reservationTimedOut call. In some embodiments, such a revocation occurs only if there is contention for workers. In some embodiments, the difference between the durationTimeout and the inactiveTimeout is that the durationTimeout is handled by the service object whereas the inactiveTimeout is handled by the worker.
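
The interplay of the two time-outs with contention can be sketched as a single decision function. This is an illustrative Python sketch only; should_revoke and its parameters are hypothetical, and the sketch folds the service-side durationTimeout check and the worker-side inactiveTimeout check into one place for brevity, whereas the description assigns them to different objects.

```python
def should_revoke(now_ms, allocated_at_ms, last_used_ms,
                  duration_timeout_ms, inactive_timeout_ms, clients_waiting):
    """Return True if a reservation may be revoked.

    A reservation becomes revocable once either time-out has expired, but
    in this sketch it is actually revoked only under contention, i.e. when
    at least one other client is waiting for a worker.
    """
    duration_expired = now_ms - allocated_at_ms >= duration_timeout_ms
    inactive_expired = now_ms - last_used_ms >= inactive_timeout_ms
    return (duration_expired or inactive_expired) and clients_waiting > 0
```

With no waiting clients, an expired reservation stays usable, which is what saves the service proxy an extra allocateWorker call.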

[0160] FIG. 28 shows a service object 450 and an LBM 454 according to another embodiment of the present invention. In particular, the service object 450 periodically pings each worker in the worker pool to obtain worker statistics 452 such as the workload of each of the workers. The service object 450 supplies the worker statistics 452 to the LBM 454, which uses the worker statistics 452 for workload balancing. For example, the LBM 454 may use the worker statistics 452 to sort the worker queues 458, 460, and 462, such as the idle queue, the usable queue, and the unusable queue. If the service object ping fails on a particular worker, then the worker is unregistered from the LBM's worker registry 456 by the service object, and all existing reservations on the worker are revoked by the service object 450. The service object 450 will then re-instantiate the worker and register the re-instantiated worker in the LBM's registry 456.
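
The ping cycle described above might be sketched as follows. This is a minimal Python sketch with hypothetical names (Worker, refresh_workers, make_worker); it models a failed ping as a ConnectionError and the registry and reservations as plain dictionaries, none of which is specified in the description.

```python
class Worker:
    """Hypothetical stand-in for a worker object."""

    def __init__(self, key, workload=0, alive=True):
        self.key = key
        self.workload = workload
        self.alive = alive

    def ping(self):
        # A dead worker is modeled as a failed remote call.
        if not self.alive:
            raise ConnectionError(self.key)
        return {"workload": self.workload}


def refresh_workers(registry, reservations, make_worker):
    """Ping every registered worker and return worker keys sorted by
    workload, least loaded first.

    registry:     dict mapping worker key -> Worker
    reservations: dict mapping worker key -> list of client keys
    make_worker:  callable that re-instantiates a worker for a given key
    """
    stats = {}
    for key in list(registry):
        try:
            stats[key] = registry[key].ping()
        except ConnectionError:
            del registry[key]                 # unregister the failed worker
            reservations.pop(key, None)       # revoke its reservations
            registry[key] = make_worker(key)  # re-instantiate and re-register
            stats[key] = registry[key].ping()
    return sorted(stats, key=lambda k: stats[k]["workload"])
```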

[0161] The service object 450 maintains the runtime statistics of each worker and constructs representative service statistics based on the worker statistics 452, such as the average of the worker values for certain statistics (e.g., workload). The service object 450 periodically forwards this information to its parent SM in response to the periodic pings from the SM (i.e., the parent SM is the SM that instantiated the service object). The SM periodically passes the service statistics along with the service information to the service locators in the cell that are responsible for distributing the object references of the registered services. These periodic updates are responsible for keeping the service locator's repository up to date.
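
As an illustration of constructing representative service statistics from worker statistics, the averaging step might look like the following Python sketch. The function name service_statistics and the dictionary layout are assumptions; the description names only the average as one example of aggregation.

```python
from statistics import mean


def service_statistics(worker_stats):
    """Average each named statistic over all workers that report it.

    worker_stats: dict mapping worker key -> dict of statistic name -> number
    """
    names = set()
    for stats in worker_stats.values():
        names.update(stats)
    return {
        name: mean(s[name] for s in worker_stats.values() if name in s)
        for name in names
    }
```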

[0162] FIG. 29 shows the scalability of the service framework in accordance with some embodiments of the present invention. The service framework of the present invention is based on a scalable architecture. In some embodiments, the service framework includes a variety of features that enhance scalability such as clones (e.g., service locator clones, service clones, etc.), multiple workers, multi-threaded workers (discussed above), asynchronous revocation callbacks (discussed above), and distributed configurations or cells (e.g., services that are part of the same administrative framework may be distributed across multiple machines as described above with respect to FIG. 18).

[0163] Ideally, scalability allows throughput to improve almost linearly with the addition of resources. For example, in an SM that manages one instance of an RDBMS service, adding another instance of the service should double throughput in the ideal case, and throughput should continue to increase linearly as additional service instances are added. In practice, however, throughput does not increase linearly, because bottlenecks arise in the system as the number of resources increases significantly. Also, network bandwidth and CPU power are limited and therefore inhibit a linear increase in throughput.

[0164] As shown in FIG. 29, the service framework of the present invention can increase scalability by providing service locator clones and service clones. A clone is another instance of an object that is essentially identical to the original instance. All clones use the same set of properties. For example, a clone of a service represents another instance of the service object with its own worker pool. The two instances of the service are independent and perform their own reservations and execution of requests. Typically, clones live in different address spaces in computer storage. Accordingly, if a clone goes down for some reason, it normally does not affect the other clones. Thus, clones also introduce another level of fault tolerance in the service framework of the present invention. In some embodiments, each service has a property that determines the number of clones for the service. The SM starts all clones of a service when the service is started. Clones are applicable to any SM-managed instance (e.g., a service locator instance). An SM-managed instance represents an instance that the SM instantiated and which the SM periodically pings.

[0165] In particular, FIG. 29 provides a service locator clone 474 and a service locator clone 476. If there are multiple service locator clones, the SM registers the service information with all available service locator clones. Thus, each clone is capable of providing handles to any service in the network. Accordingly, a service proxy 472, residing in a client 470, that needs to bind to, for example, an RDBMS service can send a getService request to any one of the service locator instances, the service locator clone 474 or the service locator clone 476, thereby balancing the workload among the multiple service locator instances. Of course, multiple service locator instances, possibly on different servers, can also be provided. This would further increase scalability if, for example, a particular server becomes CPU-bound (i.e., processor bound).
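
One way a client could spread getService requests across service locator clones is simple rotation. The following Python sketch is illustrative only: LocatorClone and RoundRobinLocatorProxy are hypothetical names, round-robin is an assumed selection policy (the description requires only that any clone can be used), and the service registry is modeled as a shared dictionary.

```python
import itertools


class LocatorClone:
    """Hypothetical service locator clone holding registered service handles."""

    def __init__(self, name, services):
        self.name = name
        self.services = services  # service name -> handle, registered by the SM
        self.requests = 0

    def get_service(self, service_name):
        self.requests += 1
        return self.services[service_name]


class RoundRobinLocatorProxy:
    """Client-side helper that rotates requests across locator clones."""

    def __init__(self, clones):
        self._cycle = itertools.cycle(clones)

    def get_service(self, service_name):
        # Each call goes to the next clone in turn.
        return next(self._cycle).get_service(service_name)
```

Because every clone holds the same registrations, the handle returned does not depend on which clone answered.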

[0166] FIG. 29 also shows multiple service clones residing in different servers. In particular, FIG. 29 shows a service clone 482 for an RDBMS 480 residing in a server 478 and a service clone 488 for an RDBMS 486 residing in a server 484. Each service can have a number of service clones (e.g., specified by the numClones property of the service). Each clone of the service is a full instance of a service (e.g., contains a service object and a pool of workers). Accordingly, additional service clones improve throughput by balancing the workload among the service clones.

[0167] In some embodiments, the numWorkers property of a service controls the number of workers that can be instantiated for the service. Each worker implements the functionality of the service and thus encapsulates the functionality of the service. For example, a worker for the DataService implements the DataService interface (e.g., executeSQL or executeStoredProcedure). Thus, the worker for the DataService may differ from a worker for another service. Concurrent access to each worker is controlled by the maxClients property of the worker. Multiple workers can be instantiated in the same process, which allows multiple clients to work concurrently on multiple workers. In particular, multiple workers in the same process means that there are multiple workers (e.g., multiple worker CORBA objects) instantiated in the same process by the service object. For example, if a worker can only support one reservation at a time (e.g., the worker.maxClients property is set to 1), and there is only one worker in the process, then only one client can perform work in that process at a time. If there are two workers in the process, then two clients can perform work in that process. If the two workers are in two different processes, then two clients can perform work concurrently, but in each process, there is only one client active at a time.
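
The three cases in the example above follow from simple arithmetic, which the following Python sketch makes explicit. The function name max_concurrent_clients is hypothetical, and the sketch assumes every worker has the same maxClients value.

```python
def max_concurrent_clients(workers_per_process, max_clients_per_worker):
    """Return (per-process capacity list, total capacity).

    workers_per_process:    number of workers in each process, e.g. [2] or [1, 1]
    max_clients_per_worker: stands in for the worker.maxClients property
    """
    per_process = [n * max_clients_per_worker for n in workers_per_process]
    return per_process, sum(per_process)


# One worker in one process, maxClients = 1: one client at a time.
print(max_concurrent_clients([1], 1))     # ([1], 1)
# Two workers in one process: two clients in that process.
print(max_concurrent_clients([2], 1))     # ([2], 2)
# Two workers in two processes: two clients in total, one per process.
print(max_concurrent_clients([1, 1], 1))  # ([1, 1], 2)
```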

[0168] Generally, a thread-safe program (code) is a program that supports any number of threads in the same program (i.e., a thread is an operating system term that indicates a thread of control in a process). Thus, if the code of a worker for a service is not thread-safe (i.e., the worker only supports one client at a time), only one client can perform work on the worker at a time. The service framework of the present invention allows multiple clients to perform work on an instance of a service even though the workers for the service are not thread-safe. For example, multiple workers may be instantiated in different processes so that multiple clients can perform work on the service using these workers. Moreover, multiple workers may be instantiated in the same process so that multiple clients may perform work on the service using these workers. Accordingly, the service framework of the present invention allows for multiple clients to perform work on the service using non-thread-safe workers without requiring the creation of multiple processes.

[0169] As the number of workers increases, the throughput generally increases, particularly for workers that are heavily serialized internally. Generally, serialized internally means that certain sections of the code of the worker are called critical sections, which are sections of code that are not thread-safe. Each client accessing a worker represents a thread. Thus, a critical section only supports one client at a time. If a worker has a large number of critical sections, or if a significant segment of the code of the worker is a critical section, then throughput may suffer, because each client must process the critical section(s) of the worker one at a time. Thus, if a worker is serialized internally, then each critical section serializes access to particular code of the worker and thus bottlenecks may reduce throughput. Thus, a worker with no critical sections may have higher throughput, because multiple clients can access the worker concurrently. Of course, additional workers may be instantiated to support concurrent access by additional clients. However, the higher the number of workers, the greater the overhead in the service itself, because the service object must collect the workload from all the workers in order to perform workload balancing of the workers and worker reservation management.

[0170] Further, redundancy through the use of clones as shown in FIG. 29 also provides increased fault tolerance in the service framework of the present invention. For example, the service framework can isolate and withstand faults generated by worker implementations, and the service framework has a high-availability feature to tolerate downtimes. In particular, in some embodiments, the service framework provides for fault isolation. For example, workers (e.g., workers implemented in C++ or JAVA) are prone to fault generation, but the faults are isolated so that the faults do not affect the running (current) configuration. The worker fault isolation can be accomplished in various ways: each worker can be located in a separate process using the numProcesses property and the numWorkers property (i.e., an out-of-process worker), and thus a fault generated by the worker would be isolated. Thus, because there can be multiple instances of almost every entity (e.g., service locator, SM, service, worker, etc.), faults within any particular address space can be tolerated. Hence, a death of a service instance can be tolerated without a complete outage if clones of the service are present and registered with the service locator. Any work performed on the service instance at the time of death may be lost and any dependent work may be affected, but clients issuing requests to other clones of the service will not be subject to any interruptions. Similarly, service and service locator instances can be distributed across multiple machines thereby isolating machine or network area faults as well. Accordingly, services as well as workers can be launched out of process thereby providing for fault isolation (e.g., distribution among different address spaces), or services as well as workers can be launched in process thereby providing for optimal memory usage.

[0171] Further, in some embodiments, the service framework includes an automatic restart feature that enhances the fault tolerance of the service framework of the present invention. In particular, every object factory can instantly restart a failed object in the event of a failure. For example, if a worker fails in an abnormal manner, the object factory restarts the worker immediately. Accordingly, the failure and the subsequent restart would be apparent only to the service proxy of the client that had been using the failed worker at the time of the failure.

[0172] In addition, the service framework of the present invention also provides high availability. In particular, the service framework supports redundancy at the worker and service object levels (e.g., worker clones and service clones). In some embodiments, multiple workers can be configured for a service (e.g., multiple workers can be instantiated based on the numWorkers property of the service), multiple service clones can be configured (e.g., defined in a service property such as service.numClones), and multiple service locators can be configured. The number of service locator instances in an SM may be defined by a service locator property such as serviceLocator.numClones (e.g., if the service locator has numClones set to 2, then the SM instantiates 2 service locator clones).

[0173] In some embodiments, the service framework of the present invention also provides rebinding and fault tolerance as illustrated by the flow diagram of FIG. 30. In particular, in addition to encapsulating the complex logic for reserving a worker, the service proxy also encapsulates rebinding to a new service and worker upon failure, thereby providing the service framework with an additional level of fault tolerance.

[0174] Accordingly, FIG. 30 is a flow diagram illustrating the fault tolerance operation for when a service object becomes unavailable. In stage 500, the service proxy determines whether an exception has been raised by the service object (e.g., whether or not the service object has failed during the reservation process). In stage 502, the service proxy also determines whether or not its currently cached service locator handle is valid. If not, the service proxy rebinds to another service locator, if available, in stage 504, and then proceeds to stage 506. In stage 506, the service proxy handles the exception raised by the service object and obtains a new service handle for a different service object from the service locator. Further, the service proxy may include some transient properties that control the rebinding processes. In some embodiments, the transient properties include a value for maximum rebind attempts and a value for delay in milliseconds before attempting to rebind to a service.
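
The rebinding behavior controlled by the two transient properties might be sketched as follows. This is a simplified Python sketch under stated assumptions: ServiceUnavailable, call_with_rebind, and the parameter names are hypothetical, the service locator is abstracted into a get_service_handle callable, and the locator-handle validity check of stages 502 and 504 is not modeled.

```python
import time


class ServiceUnavailable(Exception):
    """Hypothetical exception standing in for a failure raised by a service object."""


def call_with_rebind(get_service_handle, operation,
                     max_rebind_attempts=3, rebind_delay_ms=0):
    """Run operation(handle); on failure, wait, obtain a new service handle,
    and retry, up to max_rebind_attempts rebinds."""
    handle = get_service_handle()
    for attempt in range(max_rebind_attempts + 1):
        try:
            return operation(handle)
        except ServiceUnavailable:
            if attempt == max_rebind_attempts:
                raise  # rebind budget exhausted; surface the failure
            time.sleep(rebind_delay_ms / 1000.0)  # delay before rebinding
            handle = get_service_handle()         # rebind to another service
```

Keeping this loop inside the proxy means the client code sees either a result or a single final exception, never the intermediate failures.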

[0175] FIG. 31 shows an administrative interface 526 in a server 524 in accordance with some embodiments of the present invention. A console 520 is linked by a network or local connection 522 to the server 524 that includes the administrative interface 526. The console 520 allows an operator to define the configuration. As discussed above, the configuration defines a collection of services and service locators along with the properties of the services and the service locators. The configuration is maintained by the SM. Thus, when the SM comes up, it comes up with a predetermined configuration of the services, service locators, and their properties. Once configured, the distributed object network system that includes the service framework of the present invention is fault tolerant. Thus, an operator is not required to monitor the console once a configuration has been defined and the system started. In some embodiments, the console 520 provides a central management console for remotely administering distributed applications (e.g., a console that includes management software written in JAVA for performing remote management of clients on the global Internet such as a standard browser client).

[0176] FIG. 32 shows the administrative interface in accordance with some embodiments of the present invention. In particular, the administrative interface of FIG. 32 is written in IDL. The implementation of the service framework derives from the administrative (admin) layer. In some embodiments, the admin layer can activate/deactivate an object, and the admin layer can also customize the object's behavior through the properties of the object. Accordingly, the service object's interface derives from the admin interface, and the service object's implementation extends the admin layer. For example, the SM uses the service object's admin layer to start and stop the services and set the properties of the services.

[0177] Although particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that changes and modifications can be made without departing from the present invention in its broader aspects and, therefore, the appended claims are to encompass within their scope all such changes and modifications that fall within the true spirit and scope of this invention.

What is claimed is:
 1. A distributed object network system comprising: a first computer; a service residing on the first computer, the service providing access to a limited resource that resides on the first computer; a service framework; wherein the service framework further comprises: a first set of computer instructions executed by the first computer, the first set of computer instructions providing access to the requested service by allocating a worker in a worker pool for the service in response to a service request, wherein the worker executes the service request, and the first set of computer instructions provides workload balancing among a plurality of workers in the worker pool for the service.
 2. The apparatus of claim 1 further comprising: a second computer; a network connecting the second computer to the first computer, wherein the service framework further comprises: a second set of computer instructions executed by the second computer, the second set of computer instructions requesting a service from the first computer.
 3. The apparatus of claim 2 further comprising: a third set of computer instructions executed by a third computer, the third set of computer instructions providing central management and configuration for administering distributed applications.
 4. The apparatus of claim 2 wherein the service framework further comprises: a service locator executed by the first computer, the service locator providing an object reference to the first set of computer instructions in response to a get service operation from the second set of computer instructions, wherein the service locator provides workload balancing among instances of the service.
 5. The apparatus of claim 4 wherein the service framework further comprises: a third set of computer instructions executed by the second computer, the third set of computer instructions providing an object reference to the service locator in response to a find service operation from the second set of computer instructions, wherein the third set of computer instructions provides workload balancing among instances of the service locator.
 6. The apparatus of claim 2 wherein the service framework further comprises: a third set of computer instructions executed by the first computer, the third set of computer instructions instantiating the second set of computer instructions and registering the service with a service locator, wherein the service locator is executed by the first computer and provides an object reference to the first set of computer instructions in response to a get service operation from the second set of computer instructions.
 7. The apparatus of claim 2 wherein the second set of computer instructions comprises methods exported by an interface of the worker.
 8. The apparatus of claim 2 wherein the second set of computer instructions further comprises: obtaining an object reference to the service locator; obtaining the object reference to the requested service; obtaining a reservation on the worker in the worker pool for the requested service; and executing the service request on the worker in the worker pool for the requested service.
 9. The apparatus of claim 8 wherein the first set of computer instructions further comprises: allocating reservations among at least two workers in the worker pool for the service.
 10. The apparatus of claim 9 wherein the first set of computer instructions further comprises: balancing the workload among the workers in the worker pool by providing at least two queues that each comprise workers that have various properties, wherein each queue comprises a sub-queue that comprises workers that have various priorities.
 11. The apparatus of claim 10 wherein the second set of computer instructions further comprises: providing a reservation context that comprises a client key, a worker key, and a service key.
 12. The apparatus of claim 11 wherein the service framework is implemented as CORBA extensions.
 13. The apparatus of claim 12 wherein the second computer further comprises: a web browser.
 14. The apparatus of claim 13 wherein the service framework provides session management for a connection between the first set of computer instructions and the second set of computer instructions during a worker reservation.
 15. The apparatus of claim 14 wherein the second computer further comprises: a JAVA applet executed by the second computer, wherein the second set of computer instructions provides the service request in Internet Inter-ORB Protocol (IIOP) to the first set of computer instructions.
 16. A computer implemented method for providing a service framework in a distributed object network system, the method comprising: executing a first set of computer instructions in a first computer, the first set of computer instructions providing access to a service by allocating a worker in the worker pool for the service, wherein the first set of computer instructions provides workload balancing among a plurality of workers in the worker pool for the service, and the service provides access to a limited resource that resides on the first computer; and executing a second set of computer instructions in a second computer, the second set of computer instructions encapsulating the operation of performing a service request for the second computer, wherein the second computer connects to the first computer over a network.
 17. The computer implemented method of claim 16 wherein the step of executing the second set of computer instructions further comprises: finding the requested service using a service locator, wherein the service locator provides an object reference to the first set of computer instructions; obtaining an allocated worker in the worker pool for the requested service, wherein the allocated worker is selected and reserved by the first set of computer instructions, and the allocated worker executes service requests; receiving results of the service request from the allocated worker; and passing the results of the service request to the second computer.
 18. The computer implemented method of claim 16 further comprising: executing a third set of computer instructions in the second computer, the third set of computer instructions providing workload balancing among instances of the service locator, wherein the service locator provides an object reference to the first set of computer instructions.
 19. The computer implemented method of claim 16 wherein the step of executing the first set of computer instructions further comprises: allocating reservations among at least two workers in the worker pool for the service based on worker statistics.
 20. The computer implemented method of claim 19 wherein the step of executing the first set of computer instructions further comprises: providing a load balancing manager that comprises at least two queues that each comprise workers that have various properties, wherein each queue comprises a sub-queue that comprises workers that have various priorities.
 21. A computer-readable medium comprising software for a service framework for a distributed object network system, the service framework software comprising: a set of objects, the set of objects providing access to a service, the service providing access to a limited resource residing on a first computer, wherein the set of objects further comprises: a plurality of workers in a worker pool for the service; and a load balancing manager that balances workloads among the plurality of workers in the worker pool for the service.
 22. The computer-readable medium as in claim 21 wherein the service framework software further comprises: a service proxy object, the service proxy object encapsulating, for a second computer, the operation of a service request to the set of objects, wherein the second computer connects to the first computer over a network.
 23. The computer-readable medium as in claim 21 wherein the service framework software further comprises: a service proxy locator object, the service proxy locator object providing workload balancing among instances of a service locator, wherein the service locator provides an object reference to the set of objects and balances workloads among instances of the requested service.
 24. Computer data signals embodied in a carrier wave comprising: an allocate worker interface in Internet Inter-ORB Protocol (IIOP) from a service proxy object to a service object, the service object residing on a first computer and providing access to a limited resource on the first computer, and the service proxy object residing on a second computer and encapsulating the operation of requesting a service, wherein the allocate worker interface comprises service properties and a reservation context; and an execute request interface in IIOP from the service proxy object to an allocated worker object that resides on the first computer, wherein the allocated worker object executes service requests.
 25. The computer data signals as in claim 24 wherein the allocate worker interface in IIOP further comprises: a worker hint in the reservation context, wherein the worker hint comprises an object reference of a worker that was previously reserved by the second computer.